This report describes the application and evaluation of
several statistical models in the forecasting of cloud amount
and ceiling over selected physically homogeneous areas of
the  North  Atlantic  Ocean.  The  focus  of  this  study  is  to 
evaluate  the  applicability  of  previous  Naval  Postgraduate 
School  model  output  statistics  research  in  the  area  of 
horizontal  marine  visibility  to  the  forecasting  of  cloud 
amount and ceiling over ocean areas. The models, which include
minimum probable error linear regression threshold techniques,
maximum conditional probability, and natural regression, utilize
observed visibility data and model output parameters
from the Navy Operational Global Atmospheric Prediction
System (NOGAPS). The linear regression and maximum conditional
probability models produce statistically similar results.
Also included are the results of additional experimentation
on the application of several measures of separability
and cluster analysis to predictor selection.
IV. PROCEDURES


A.  TERMS  AND  SYMBOLS 

The  terms  and  statistical  symbols  defined  below  will 
be  used  throughout  the  remainder  of  this  report.  The  formal 
mathematical  definitions  are  described  in  Karl  (1984). 

1. Maximum probability strategy--choosing the forecast
weather element (e.g., cloud amount or ceiling)
category based upon the highest probability of
the weather element within a predictor interval,
hence conditional probability.

a. MAXPROB I--designation of the maximum probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are
resolved by the generation of a random number.

b. MAXPROB II--designation of the maximum probability
strategy in which ties of the highest conditional
probabilities in a predictor interval are
resolved by assigning the lowest element category,
of those tied, as the forecast category.

2. Natural regression strategy--choosing weather categories
based upon the statistical average of the
conditional probabilities of the weather element
within a predictor interval.

3. A0--the probability of a zero-class weather element
category forecast error (e.g., if cloud amount
category I is forecast and observed). This is more
generally known as total percentage correct.

4. A1--the probability of a one-class weather element
category forecast error (e.g., if ceiling category
I is forecast and category II is observed).

5. A2--the probability of a two-class weather element
category forecast error (e.g., if ceiling category
I is forecast and category III is observed).

6. CE--class error parameter defined as A1 + 2A2, used as
the primary aid in identifying the first predictor
for the Preisendorfer strategies.
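
As a minimal sketch of items 3 through 6, the class-error statistics can be computed from a forecast-versus-observed contingency table. The function name and the table layout (rows are forecast categories, columns are observed categories) are assumptions made for illustration; the formal definitions are those of Karl (1984).

```python
def class_error_stats(table):
    """Compute A0, A1, A2 and CE from a 3x3 contingency table.

    table[f][o] is the number of cases with forecast category f and
    observed category o (0-based indices; layout is an assumption).
    """
    total = sum(sum(row) for row in table)
    counts = [0, 0, 0]  # cases with a 0-, 1- and 2-class error
    for f, row in enumerate(table):
        for o, n_cases in enumerate(row):
            counts[abs(f - o)] += n_cases
    a0, a1, a2 = (c / total for c in counts)
    return {"A0": a0, "A1": a1, "A2": a2, "CE": a1 + 2 * a2}


# Hypothetical counts, for illustration only.
example = [[50, 10, 5],
           [10, 40, 10],
           [5, 10, 60]]
stats = class_error_stats(example)
```

Because CE weights a two-class miss twice as heavily as a one-class miss, a lower CE indicates forecasts that fall closer to the observed category.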
D. TRAINING/TESTING DATA SETS

One-third of the observations were withheld from the
developmental model to use as an independent data set (the
testing set). This was accomplished by the use of a counter
and transfer statement in the computer programs, which prevented
every third observation from entering the developmental
computations. Although the approach has the advantage
of simplicity, there could be some sort of ordering in the
data base, so the split runs a chance of being non-random.
To ensure that the dependent (training) and
independent (testing) data sets were representative
of the same population, a 95% confidence interval for proportions
(Miller and Freund, 1977) was established from the
entire data set for each of the weather element categories;
the training and testing data sets were constrained to have
frequencies of occurrence within these established confidence
intervals.
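
The split and the representativeness check described above can be sketched as follows. This is an illustration, not the original program: the function names are hypothetical, and the interval uses the common large-sample normal approximation for a proportion, which is assumed here to correspond to the Miller and Freund (1977) formula.

```python
import math


def split_every_third(observations):
    """Withhold every third observation for testing; the rest are training."""
    train, test = [], []
    for counter, obs in enumerate(observations, start=1):
        (test if counter % 3 == 0 else train).append(obs)
    return train, test


def proportion_ci(count, n, z=1.96):
    """Large-sample 95% confidence interval for a proportion (assumed form)."""
    p = count / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


def within_ci(subset, full, category):
    """True if the category frequency in subset lies inside the interval
    built from the full data set."""
    lo, hi = proportion_ci(full.count(category), len(full))
    freq = subset.count(category) / len(subset)
    return lo <= freq <= hi
```

The ordering hazard is easy to reproduce: if the categories repeat in the order 1, 2, 3, every withheld observation is category 3 and `within_ci` fails for that category, which is the kind of non-random split the confidence-interval constraint is meant to catch.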
co-located with the National Climatic Data Center (NCDC).

The observations which were obviously erroneous, as determined
from the data quality indicators provided with the
data, were deleted from the working data sets.

4. Predictor Parameters

Fifty TAU-00, fifty-four TAU-24 and fifty-four TAU-48
model output predictors (MOP's) were provided by the Fleet
Numerical Oceanography Center (FNOC), Monterey, California.
These parameters are generated by their current operational
atmospheric prediction model, the Navy Operational Global
Atmospheric Prediction System (NOGAPS). All MOP's were
interpolated from model grid coordinates to synoptic ship
report position using a linear interpolation scheme. In
addition to the initial group of model output parameters,
ten derived parameters representing calculated quantities,
such as parameter gradients, products and advections, were
included as potential predictors. A listing of all available
TAU-00, TAU-24 and TAU-48 MOP's is included in
Appendix D.
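
The text does not specify the form of the linear interpolation scheme. The sketch below assumes bilinear interpolation on a regular latitude/longitude grid, which is one common choice; the function name, the grid layout and the grid spacing are all hypothetical.

```python
def bilinear(grid, lat0, lon0, dlat, dlon, lat, lon):
    """Interpolate a gridded field to a ship position.

    grid[i][j] holds the value at (lat0 + i*dlat, lon0 + j*dlon).
    The position is assumed to lie inside the grid.
    """
    y = (lat - lat0) / dlat
    x = (lon - lon0) / dlon
    # Clamp so that a point on the last row/column still has a neighbor.
    i = min(int(y), len(grid) - 2)
    j = min(int(x), len(grid[0]) - 2)
    fy, fx = y - i, x - j
    south = grid[i][j] * (1.0 - fx) + grid[i][j + 1] * fx
    north = grid[i + 1][j] * (1.0 - fx) + grid[i + 1][j + 1] * fx
    return south * (1.0 - fy) + north * fy


# Hypothetical 2x2 patch of a model field on a 2.5-degree grid.
patch = [[0.0, 10.0],
         [20.0, 30.0]]
value = bilinear(patch, 40.0, -30.0, 2.5, 2.5, 41.25, -28.75)
```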

For each homogeneous area and model forecast projection,
a set of two linear regression equations, in addition
to the aforementioned MOP's, was included as potential
MOP's for a separate evaluation of the Preisendorfer
methodology (the PR+BMD model). These two predictor
equations were obtained from a standardized linear regression
software package, P9R, an all possible subsets regression,
as addressed in the BMDP Statistical Software (University of
California, 1983).
2. Time Period

Data from mid-May 1983 to mid-July 1983 were combined
to form a more extensive data set, hereafter referred to
as FATJUNE 1983. FATJUNE 1983 was selected as the initial
data set for the visibility studies due to its high frequency
of occurrence of poor visibility observations, and it was
chosen for this study to maintain continuity with the overall
MOS project. 1200 GMT synoptic ship report data were used
exclusively in this study since 1200 GMT corresponds to general
daylight conditions over the North Atlantic Ocean during
FATJUNE. For the purpose of this study, TAU-00 model output
parameters (MOP's) generally represent six-hour model forecasts
valid at 1200 GMT. However, three specific fields, namely
temperature, geopotential height and wind, are model
initialization fields at 1200 GMT. TAU-24 and TAU-48 MOP's are
24-hour and 48-hour model forecasts, respectively, valid at 1200
GMT. TAU-00, TAU-24 and TAU-48 MOP's (predictors) are
employed in the 00-, 24- and 48-h forecast schemes, respectively.
Summaries of the cloud amount and ceiling frequencies for
each category type, as a function of homogeneous area and
prediction time for FATJUNE 1983, are contained in Tables I
through IV.

3. Synoptic Weather Reports

All synoptic weather observations (predictand data)
for this study were provided by the Naval Oceanography
Command Detachment (NOCD), Asheville, North Carolina, which is
Ceiling Category    Code    Definition

I                   0-3     < 1000 feet

II                  4-5     1000-3500 feet

III                 6-9     > 3500 feet

The above scheme is based on U.S. Navy operational
criteria:

1. Ceiling less than 1000 feet--U.S. Navy aircraft
carrier at-sea flight recovery operations require
controlled (IFR) approach guidelines (Department
of the Navy, 1979).

2. Ceiling 1000-3000 feet--flight recovery operations
require modified IFR approach guidelines.

3. Ceiling greater than 3000 feet--at-sea recovery
operations change to visual (VFR) approach guidelines.

C. NORTH ATLANTIC OCEAN DATA

1. Area

The North Atlantic Ocean, from 0° to 80°N latitude,
was divided into homogeneous oceanic areas following Lowe
(1984b), using a statistical cluster analysis technique.
The specific homogeneous areas evaluated in this study are
identified as areas 2 and 4 on Fig. 2. These areas were
selected because they contain the largest data samples and
represent two different relative frequencies of cloud cover.
Area 4 represents an area where the three categories are of
nearly equal population, while area 2 represents an area where
the number of category I (clear and scattered) observations
is about one-half of those in categories II and III
(broken and overcast).
as thin or partial. However, the synoptic Surface Marine
Observations data set does not give a direct observation of
ceiling height, making it necessary for the purpose of
this study to synthesize ceiling height from the data that
are given. The data set gives the following reported fields:

CLAMT: Cloud amount or total sky cover

LOAMT: Total sky cover by low clouds (middle clouds
if no low clouds are present)

CLHT: Height of the lowest clouds irrespective
of amount.

Cloud height is reported with a synoptic code from 0 to 9,
where 0 refers to heights from 0 to 50 feet and 9 refers
to heights greater than 6500 feet or cases where no clouds
are present. This 6500-foot height corresponds roughly to
the upper boundary of the clouds reported in the low cloud
amount field, LOAMT, making possible the following definition
of ceiling for calculational purposes in this study:

If the reported LOAMT < 5/8, then the ceiling is
unlimited.

If the reported LOAMT >= 5/8, then the ceiling is taken
as the reported cloud height.
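
A minimal sketch of this ceiling synthesis, combined with the ceiling category codes tabulated in this chapter (codes 0-3, 4-5 and 6-9), is given below. The function names are hypothetical, and placing an unlimited ceiling in category III is an assumption made for illustration; the text does not state it explicitly.

```python
def ceiling_code(loamt_oktas, clht_code):
    """Return the synoptic height code taken as the ceiling, or None
    when the ceiling is unlimited (low cloud amount below 5/8)."""
    if loamt_oktas < 5:
        return None
    return clht_code


def ceiling_category(loamt_oktas, clht_code):
    """Map a report to ceiling category "I", "II" or "III"."""
    code = ceiling_code(loamt_oktas, clht_code)
    if code is None:
        return "III"  # assumption: unlimited ceiling joins the highest class
    if code <= 3:
        return "I"
    if code <= 5:
        return "II"
    return "III"
```

For example, a report with 6/8 low cloud and a height code of 2 falls in category I, while the same height code with only 3/8 low cloud is treated as an unlimited ceiling.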

The  ceiling  observations  are  likewise  treated  as 
categorized  predictands  and  are  divided  into  the  following 
categories  for  prediction  purposes: 


III. DATA


A.  CLOUD  AMOUNT  OBSERVATIONS  AND  SYNOPTIC  CODES 

Cloud amount is defined as the fraction of the celestial
dome covered by all clouds. The observations taken from
seagoing platforms are reported as values of zero to eight
oktas (eighths) such that 0 means no clouds, 1 means 1/8th
cloud cover, etc. In addition, 9 is used to report an
obscured sky (e.g., smoke, fog), for which a defined cloud
cover is not observable. The observations were treated as
categorized predictands and were divided into categories
conforming to the standard definitions of opaque sky cover
for clear, scattered, broken and overcast, as used for
aviation observations.

Eighths  Definition 

0-4  clear/scattered 

5-7  broken 

8  overcast 

The obscured observation was not used in the MOS development
reported here.
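
The categorization above can be sketched as a small lookup. The function name is hypothetical, and returning None for the obscured code is simply one way to represent an observation that is excluded from development.

```python
def cloud_amount_category(oktas):
    """Map a reported cloud amount (0-8 oktas, 9 = obscured) to a category.

    Returns "I" (clear/scattered), "II" (broken), "III" (overcast),
    or None for an obscured sky, which is not used.
    """
    if oktas == 9:
        return None
    if not 0 <= oktas <= 8:
        raise ValueError("cloud amount code must be 0-9")
    if oktas <= 4:
        return "I"
    if oktas <= 7:
        return "II"
    return "III"
```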

B.  CEILING  OBSERVATIONS  AND  SYNOPTIC  CODES 

The  definition  of  ceiling  is  the  height  ascribed  to  the 
lowest  layer  of  clouds  or  obscuring  phenomena  when  it  is 
reported  as  broken,  overcast,  or  obscured  and  not  classified 


II.  OBJECTIVES  AND  APPROACH 


The  objective  of  this  study  is  to  extend  the  previous 
NPS  research  for  predicting  horizontal  marine  visibility 
using  model  output  statistics  (MOS)  (Karl,  1984;  Diunizio, 
1984)  to  the  prediction  of  cloud  ceiling  and  cloud  cover 
over  coastal  and  open  ocean  areas  of  the  North  Atlantic 
Ocean.  The  approach  to  the  problem  is  as  follows: 

A.  Define  categorical  groupings  of  cloud  amount  and 
ceiling  height  which  relate  to  operational  use 
at  sea. 

B.  Determine  if  one  element  is  important  to  the 
prediction  of  the  other  and,  therefore,  should 
be  investigated  first. 

C. Apply the previously investigated methods for
forecasting visibility to cloud amount and ceiling,
and evaluate their performance. These methods
include Preisendorfer (1983 a,b,c) maximum probability
and natural regression strategies, and linear
regression threshold models as proposed by Lowe
(1984a).

D. Compare and contrast the results of the two methodologies
in C, above, and conduct some experimentation
to improve their applicability to cloud cover and
ceiling prediction.

E. Investigate alternative predictor selection schemes
to improve the ability of the MOS models to distinguish
between the predictand categories.

F.  Make  recommendations  on  the  usefulness  of  the 
schemes  investigated  and  potential  avenues  for 
future  work  in  this  area. 
range of atmospheric visibility. Secondly, observed
weather reports are not only limited in number, but
also come from moving platforms, so that weather trend
information for a single (fixed) station, as is available
in land observations, is lacking. Statistical methodologies
tested by Karl (1984) and Diunizio (1984) to overcome the
at-sea MOS problems include a conditional probability
approach proposed by Preisendorfer (1983 a,b,c) and various
innovative threshold techniques, as applied to the linear
regression model, developed by Lowe (1984a).

This  study  represents  a  continuation  of  the  North  Atlantic 
Ocean  MOS  studies  on  visibility,  by  Karl  (1984)  and  Diunizio 
(1984).  However,  in  this  case,  the  statistical  methods 
tested  by  the  earlier  visibility  studies  are  applied  to  cloud 
amount  and  ceiling.  The  methods  used  here  have  been  designed 
to  be  consistent  with  those  of  the  previous  studies  in  order 
to  allow  for  comparison  of  results,  as  appropriate. 


Prediction Research Facility (NEPERF) in Monterey, California
sponsored a limited amount of research into naval applications
of MOS, with most of the effort going toward marine visibility
and fog. The results of these Navy studies and the
encouraging performances of the NWS and AWS MOS programs
prompted the Navy, in the spring of 1983, to begin development
of a MOS program under the guidance of NEPERF, to
forecast operational air/ocean parameters over the oceans of
the world. The proposed milestones of this ten-year project
are summarized in Fig. 1. The first operational weather
parameter investigated in the program is horizontal visibility
over the North Atlantic Ocean using MOP's from the
Navy Operational Global Atmospheric Prediction System (NOGAPS),
a dynamical primitive equation (PE) model run operationally
at the Fleet Numerical Oceanography Center (FNOC) (Karl, 1984;
Diunizio, 1984).

Previous experimental work by the Navy to forecast
open-ocean fog and visibility using linear regression
equations (Aldinger, 1979; Yavorsky, 1980; Selsor, 1980;
Koziara et al., 1983; Renard and Thompson, 1984) shows skill
of marginal operational usefulness but exceeding that of
persistence and/or climatology. Two factors limit the
potential for MOS forecasts of visibility and fog at sea.
First, there is the lack of 'calibrated' fog and visibility
observations, in that shipboard weather observers lack
sufficient reference points to be able to accurately estimate the

assembled  for  a  land  station  or  region  for  a  period  of 
several  years,  stratified  by  season  or  month. 

The National Weather Service (NWS) has included MOS as
an integral part of its weather forecasting operations
since the mid-1970's and currently forecasts approximately
15 weather elements at forecast times of 6 to 48 hours.
These MOS forecast equations, developed by the National Oceanic
and Atmospheric Administration's (NOAA) Techniques Development
Laboratory (TDL), are based on model output parameters
(MOP's) from the U.S. regional model, LFM-II. In December
of 1980, the Air Force Air Weather Service (AWS) also
implemented a MOS forecasting scheme at the Air
Force Global Weather Center (AFGWC), Offutt AFB, Nebraska
(Best and Pryor, 1983), and operated it for approximately
18 months. The program was terminated with the decision to
replace their hemispheric primitive equation model with a
spectral global dynamic model (Klein, 1981). The linear
regression techniques used by both the Air Force and NWS have
demonstrated operationally useful skill in forecasting weather
elements at locations over land throughout the world (Best and
Pryor, 1983). In this technique, called Regression Estimated Event
Probability (REEP), predictor variables are discretized into
sets of dummy variables prior to regression.

The Navy's unique responsibilities for marine forecasting
provide a motive for it to have its own MOS system. In
the late 1970's and early 1980's, the Naval Environmental
I. INTRODUCTION AND BACKGROUND


Numerical weather prediction models have made great
progress in forecasting the basic meteorological variables
and fields on the synoptic scale, such as sea-level pressure,
wind, and moisture (e.g., relative humidity). However,
dynamic models have had little success in predicting sensible
weather variables at the regional/local scale, and in
fact most models do not forecast many of these variables
directly. Stochastic-dynamic prediction is being explored;
it shows promise for operational use sometime in the future,
but it awaits much further development and more powerful
computers.

One of the most significant developments in weather prediction
is the combination of dynamical and statistical
methods, known as model output statistics (MOS). The MOS
technique is the determination of a statistical relationship
between a weather element of interest (e.g., visibility,
ceiling, precipitation) and a large menu of parameters output
from an operational numerical prediction model (e.g., boundary
layer wind, constant-pressure height, temperature). In the
case of the National Weather Service, the operational MOS
technique is based on multiple linear regression, where the
prediction equations are developed from forecast model
parameters (predictors) and observed weather (predictands)

Contingency table results for the area 2, TAU-00,
single-stage regression, EVAR model for ceiling

Contingency table results for the area 2, TAU-00,
single-stage regression, MLDC model for ceiling

Confidence intervals for significance with respect
to baseline--area 2, TAU-00, ceiling

Skill diagram and contingency table results for
ceiling area 2, TAU-00 (PR+BMD model) for the
(a) MAXPROB I, (b) MAXPROB II, and (c) natural
regression strategies

Contingency table results for the area 2, TAU-00,
single-stage regression, using cloud amount as a
predictor in the EVAR model for ceiling

Contingency table results for the area 2, TAU-00,
single-stage regression, using cloud amount as a
predictor in the QUAD model for ceiling

Skill diagram and contingency table results for
ceiling area 2, TAU-00 (PR+BMD model) using cloud
amount as a predictor for the (a) MAXPROB I,
(b) MAXPROB II, and (c) natural regression
strategies


26. Skill diagram and contingency table results for
cloud amount area 2, TAU-24 (PR+BMD model) for the
(a) MAXPROB I, (b) MAXPROB II, and (c) natural
regression strategies

27. Confidence intervals for significance with respect
to chance--area 2, TAU-48, cloud amount

28. Contingency table results for the area 2, TAU-48,
single-stage regression, EVAR model for cloud
amount

29. Contingency table results for the area 2, TAU-48,
single-stage regression, MLDC model for cloud
amount

30. Confidence intervals for significance with respect
to baseline--area 2, TAU-48, cloud amount

31. Skill diagram and contingency table results for
cloud amount area 2, TAU-48 (PR model) for the
(a) MAXPROB I, (b) MAXPROB II, and (c) natural
regression strategies

32. Skill diagram and contingency table results for
cloud amount area 2, TAU-48 (PR+BMD model) for the
(a) MAXPROB I, (b) MAXPROB II, and (c) natural
regression strategies

33. Contingency table results for the area 2, TAU-00,
single-stage regression; predictors chosen by
highest measures of separability for category I
versus II for cloud amount

34. Contingency table results for the area 2, TAU-00,
single-stage regression; predictors chosen by
highest measures of separability for category II
versus III for cloud amount

35. Contingency table results for the area 2, TAU-00,
single-stage regression; predictors chosen by
combination of clustering and separability
techniques for cloud amount

36. Contingency table results for the area 2, TAU-00,
two-stage regression, predictors chosen by
combination of clustering and separability
techniques for cloud amount, (a) EVAR and (b) QUAD

37. Confidence intervals for significance with respect
to chance--area 2, TAU-00, ceiling
7. PP--the potential predictability of the weather
element by any given predictor. Potential predictability
of a predictand/predictor pair is defined by
Karl (1984) as

$$
PP(2|1) = \frac{n}{n-1} \sum_{i=1}^{m} P_1(i) \left[ \sum_{j=1}^{n} \left( P_{2|1}(j|i) - \frac{1}{n} \right)^2 \right]
$$

where:

P_1(i) = the marginal probability of a
predictor;

P_{2|1}(j|i) = the conditional probability of the
jth predictand, given the ith
predictor.


8. EPI--equally populous interval used to discretize
the predictors (i.e., subintervals of equal population
size based on the predictor range of values).

9. Functional dependence--a measure of the stochastic
dependence of one predictor upon another. Functional
dependence is the probability that one of the predictors
will change when the other changes. A high functional
dependence value between an already selected
predictor and another potential predictor indicates
that the potential predictor can add little information
beyond that of the first. Conversely, a low
functional dependence value between the same two
predictors indicates that each predictor possesses
distinct information about the predictand. Functional
dependence ranges from 0.0 to 1.0 (1.0 = highest functional
dependence). The specific derivation and
mathematical description of the concept of "functional
dependence" are discussed in greater depth by
Preisendorfer (1983c).

10. Root-sum-squared functional dependence--the functional
dependence of a predictor on all predictors already
included in the developmental model. It is equal
to the square root of the sum of the squares of the
individual functional dependence values.

11. TS1, TS2, TS3--threat score for weather element
category I, II and III, respectively, computed from
a contingency table (see Appendix E).
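
Appendix E is not reproduced in this section, so the sketch below assumes the conventional threat score: for each category, correct forecasts divided by the number of times the category was either forecast or observed. The function name and the table layout (rows forecast, columns observed) are assumptions.

```python
def threat_scores(table):
    """Return [TS1, TS2, TS3] from a square contingency table.

    table[f][o] is the count of cases with forecast category f and
    observed category o. Conventional definition assumed:
    TS = hits / (forecasts + observations - hits).
    """
    n = len(table)
    scores = []
    for k in range(n):
        hits = table[k][k]
        forecasts = sum(table[k])
        observations = sum(table[f][k] for f in range(n))
        denom = forecasts + observations - hits
        scores.append(hits / denom if denom else float("nan"))
    return scores


# Hypothetical counts, for illustration only.
ts1, ts2, ts3 = threat_scores([[50, 10, 5],
                               [10, 40, 10],
                               [5, 10, 60]])
```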
B. COMPUTER PROGRAMS


Four computer programs were developed by Karl (1983) to test
the proposed Preisendorfer (1983 a,b,c) methodology for forecasting
visibility. These programs were rewritten to allow them to
be applied to cloud amount and ceiling forecasting and are on
file in the Department of Meteorology, Naval Postgraduate School,
Monterey, California, 93943.

1. A program to compute A0, A1, CE and PP for all predictors
and all strategies (MAXPROB I, MAXPROB II and natural
regression) for a particular number of equally populous
predictor intervals. Statistics for the three strategies
are based upon the predictor(s) that proved
optimal for each strategy.

2. A program to compute functional dependence for all predictors,
on a given predictor, for a given number of equally
populous intervals and to compute the associated 96%
critical confidence interval value (referred to as functional
dependence(96) in this study) by Monte Carlo means.

3. A program to construct contingency tables and to compute
skill and threat scores, for both the testing and training
data.

4. A program to generate 100 random data sets, from the
marginal probabilities of the predictor(s) in the
developmental model, and to compute upper and lower
5% critical confidence interval values for A0 and A1
to be used for testing the significance of the results
for each of the Preisendorfer models against chance.

These confidence interval values are calculated via
Monte Carlo means. This study developed another testing
standard derived as a consequence of the central limit
theorem. It is used in the results section to discuss
the significance of the results of each of the models
used, and is presented later in this chapter.

A second set of programs, taken mainly from the BMDP
Statistical Software Package (University of California, 1983),
was used to develop the regression equations.

1. BMDP P9R. An All Possible Subsets Regression program
used to initially select predictors beginning with
a general screening of the entire set of potential
predictors.

2. BMDP P1R. A straight regression program to develop
the prediction equation using the variables selected
by the P9R.

3. BMDP P5D. This program takes the developed prediction
equation and produces histograms of the data set
divided into the prediction categories.

4. A program to generate the thresholds used with the
regression equations. (These will be discussed in
more detail later in this chapter.)

5. A program to construct contingency tables and to
compute skill and threat scores from the regression
equation experiments for both the training (dependent)
and testing (independent) data sets.

C. MODELS

1. Preisendorfer PR Model

This  model  represents  the  first  of  two  different 
applications  of  the  basic  Preisendorfer  methodology 
(Preisendorfer,  1983  a,b,c).  Karl  (1984),  in  his  preliminary 
research,  provides  a  rigorous  interpretation  and  results 
associated  with  this  approach.  Karl's  study  provides  the 
necessary  background  for  the  continuing  MOS  studies  using 
this  model.  This  material  will  not  be  repeated  here. 

The PR model utilizes the working set of NOGAPS model
output parameters (MOP's) and derived parameters (Appendix D)
as potential predictors in constructing a developmental model,
based upon the training data set, which provides the structure
by which the testing data set is tested and evaluated.
In general, these potential predictors have their range of
values partitioned into discretized equally populous predictor
intervals ("cells"), and conditional probabilities of
the predictand are calculated according to the three categories
for cloud amount and ceiling, specified in Chapter
III. Three separate strategies for determining the specific
category which is to be identified with each predictor value
are proposed. These strategies, two based upon maximum
probability and the third based on a natural regression
approach, are addressed as MAXPROB I, MAXPROB II and natural
regression (NATR) in the remaining portions of the study.
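
The steps in this paragraph can be sketched as follows: partition a predictor into equally populous cells, estimate the conditional probabilities of the predictand in each cell, and apply the three strategies. All function names are hypothetical. The natural regression rule is read here as rounding the probability-weighted mean category, which is an assumed interpretation; the formal definitions are in Karl (1984).

```python
import bisect
import random


def epi_edges(values, k):
    """Return k-1 cut points giving k roughly equally populous cells
    (tied predictor values can make the cells unequal)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def conditional_probs(values, categories, edges, n_cat=3):
    """Row i holds P(category | predictor in cell i); categories are 0-based."""
    counts = [[0] * n_cat for _ in range(len(edges) + 1)]
    for v, c in zip(values, categories):
        counts[bisect.bisect_right(edges, v)][c] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


def maxprob(probs, tie_rule="II", rng=None):
    """Return the 1-based category with the highest conditional probability.
    Rule "I" breaks ties randomly; rule "II" takes the lowest tied category."""
    top = max(probs)
    tied = [k + 1 for k, p in enumerate(probs) if p == top]
    if len(tied) == 1 or tie_rule == "II":
        return tied[0]
    return (rng or random.Random()).choice(tied)


def natural_regression(probs):
    """Return the category nearest the probability-weighted mean category
    (assumed reading of the natural regression strategy)."""
    mean_category = sum((k + 1) * p for k, p in enumerate(probs))
    return int(mean_category + 0.5)  # round half up
```

The strategies can disagree: for conditional probabilities of 0.5, 0.0 and 0.5, MAXPROB II forecasts category 1 while the natural regression reading used here forecasts category 2.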

Initial evaluation of this model involves varying
the number of equally populous predictor intervals from four
through ten, and selecting an optimal first predictor which
satisfies one of the following criteria, applied in the
designated order:

a. the lowest CE value of all the potential predictors;

b. the highest PP value of all the potential predictors.
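
A minimal sketch of this first-predictor choice follows. It reads "in the designated order" as lowest CE first, with highest PP used to break CE ties, which is an assumption. PP is computed from a table of counts by taking the definition of Karl (1984) as n/(n-1) times the marginal-weighted sum of squared deviations of the conditional probabilities from 1/n, where n is taken to be the number of predictand categories (an inference from the summation limits). Function and predictor names are hypothetical.

```python
def potential_predictability(counts):
    """PP from counts[i][j]: cases in predictor interval i, category j."""
    total = sum(sum(row) for row in counts)
    n = len(counts[0])
    weighted = 0.0
    for row in counts:
        row_total = sum(row)
        if row_total == 0:
            continue  # an empty interval contributes nothing
        p_marginal = row_total / total
        weighted += p_marginal * sum((c / row_total - 1.0 / n) ** 2 for c in row)
    return n / (n - 1) * weighted


def pick_first_predictor(stats):
    """stats maps predictor name -> (CE, PP).
    Lowest CE wins; highest PP breaks CE ties (assumed reading)."""
    return min(stats, key=lambda name: (stats[name][0], -stats[name][1]))


# Hypothetical CE and PP values for three candidate predictors.
first = pick_first_predictor({"rh": (0.62, 0.10),
                              "u": (0.58, 0.08),
                              "t": (0.58, 0.12)})
```

Under this reading PP is 0 when every interval has the same uniform category distribution and 1 when each interval contains a single category.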

Once a first predictor is identified for each of the
four through ten equally populous predictor intervals,
corresponding category I, II and III threat and AO skill
scores (Appendix E) are calculated for both the dependent
and independent data sets. The practice of selecting an
optimal equally populous predictor interval (optimal in the
sense of maximizing AO) from the eligible grouping sizes of
four through ten was proposed by Karl (1984) and used by
Diunizio (1984) as a practical procedure which would permit
the realization of peak skill scores while keeping the
associated computer storage requirements at a manageable
level. An unfortunate consequence of this range of potential
grouping sizes is that certain statistical calculations
associated with equally populous predictor intervals of eight,
nine and ten are terminated before completion due to a two
megabyte storage ceiling at the NPS W.R. Church Computer
Center (Diunizio, 1984). When considering potential
predictor intervals, the size of the interval is of obvious
importance, with lower values being the most desirable. In
the previous studies in the MOS series concerning visibility,
the criterion for determining the optimal equally populous
predictor interval was to select the smallest interval value
which maximized the dependent data set AO and independent
category I threat score. The threat score for category I
was selected for this purpose because it was felt that low
visibility (represented by category I) was uniquely important
to forecast. In dealing with cloud amount there is no
single category that is obviously most important to
forecast, and therefore the selection of the interval was based
only on the maximized dependent AO. This interval was then
fixed for all ensuing aspects of the model evaluation.
Consistent with the findings of the previous studies, the
selection criteria are based on the MAXPROB II scores;
hence the MAXPROB I and natural regression strategies play
no role in the predictor selection scheme.
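
The interval-size rule described above can be sketched in a few lines. This is only an illustration: the function name and the AO values are hypothetical, and breaking ties in favor of the smallest grouping size is an assumption carried over from the stated preference for lower values.

```python
def select_interval_size(dependent_ao):
    """Pick the grouping size with the highest dependent-set AO.

    dependent_ao maps a grouping size (4 through 10) to the dependent
    AO score obtained with that many equally populous intervals.
    Ties go to the smallest size (an assumption, see the lead-in).
    """
    best_ao = max(dependent_ao.values())
    return min(size for size, ao in dependent_ao.items() if ao == best_ao)


# Hypothetical AO values, for illustration only.
example_scores = {4: 44.0, 5: 45.2, 6: 47.5, 7: 46.1, 8: 47.5}
chosen_size = select_interval_size(example_scores)
```

Sizes eight through ten would in practice also be limited by the storage ceiling mentioned above, which this sketch does not model.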

Once the first predictor and its associated equally
populous predictor interval have been identified, a functional
dependence test of the first predictor against the remaining
potential predictors is run. The second, third and all
subsequent predictors are selected only if both of the
following criteria are met:

a.  subsequent  predictors  must  increase  AO  over  the 
AO  value  attained  at  the  preceding  level,  and 

b.  the  selected  predictor  must  have  the  lowest  root- 
sum-square  functional  dependence  of  all  the 
remaining  potential  predictors. 

Significance  tests  were  run  on  the  developmental 
model  after  each  predictor  selection  stage  had  been  completed 
to  determine  if  the  results  were  suitably  significant  as 
compared  to  random  chance.  This  was  accomplished  using  the 
previously  mentioned  Monte  Carlo  method  generating  the  05 
and  96  percentile  confidence  intervals  using  100  randomly 
generated data sets. On further consideration, 100 cases
may not be a large enough sample size for the Monte Carlo
test. For this reason a testing technique derived as a
consequence of the central limit theorem, and more fully
described at the end of this chapter, was applied to the
results at the end of each run to demonstrate that the
results are significant in relation to chance.
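
A Monte Carlo chance interval of the kind described above might be sketched as follows. The text does not say how the random data sets were generated, so shuffling the observed categories is an assumption, as are the function names, the seed, and the default percentile limits.

```python
import random


def percent_correct(forecast, observed):
    """AO: percentage of cases where forecast and observed categories agree."""
    hits = sum(f == o for f, o in zip(forecast, observed))
    return 100.0 * hits / len(observed)


def monte_carlo_interval(observed, n_sets=100, lower_pct=5, upper_pct=95, seed=0):
    """Percentile limits of AO for randomly generated forecasts.

    Each random data set is a shuffled copy of the observed categories
    (an assumption; the source does not give the generation scheme).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sets):
        shuffled = list(observed)
        rng.shuffle(shuffled)
        scores.append(percent_correct(shuffled, observed))
    scores.sort()

    def pick(pct):
        # Nearest-rank style lookup into the sorted scores.
        return scores[round(pct / 100 * (len(scores) - 1))]

    return pick(lower_pct), pick(upper_pct)


# Illustrative data: three equally frequent categories.
observed_categories = [1, 2, 3] * 100
chance_low, chance_high = monte_carlo_interval(observed_categories)
```

A model score would then be called significant with respect to chance only if it falls above the upper limit.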

The  model  development  continues  along  these  criteria 
until  computer  storage  limitations  preclude  further  addition 
of  parameters.  This  generally  occurred  in  previous  studies, 
and  in  every  case  in  this  study,  at  the  fifth  predictor 
level.  Once  the  developmental  model  is  completed,  contingency 
tables  of  the  forecast  element  category  versus  the  observed 
element  category  are  constructed  for  both  the  dependent  and 
independent  data  sets,  and  threat  and  skill  scores  are 
computed  and  compared. 

2.  Preisendorfer PR+BMD Model

This model is still the PR model described above,
but now sets of two linear regression equations are added
to the list of potential predictors, which otherwise consists
of the NOGAPS MOP's and derived parameters.

3.  Linear Regression Models

Linear regression represents the more traditional
approach to MOS. Regression Estimated Event Probability
(REEP) is the basis for the National Weather Service and
Air Weather Service regression models. In this study two
approaches to the regression model are explored, a
single-stage and a two-stage, and three threshold algorithms are
used: equal-variance, quadratic, and a modified
maximum-likelihood-decision-criteria. The procedures are outlined
here, but a more detailed explanation of the theories is
given in Appendix A.

a.  Single Stage Regression

This model, referred to in the tables as BMD SS,
consists of generating a single linear regression equation
trained on the dependent data set, with the predictand set
equal to 1, 2 or 3, corresponding to weather element
categories I, II or III, respectively. This equation is then
used with the dependent training set in the graphical plotting
program BMD P5D, from the BMDP Statistical Software, to
generate a set of three histograms and a listing of the
individual frequency of observation (P), mean (u), and standard
deviation (s) of each of the three predictand distributions.
These statistics are then used in the threshold algorithms
to calculate two threshold values. Finally, the regression
equation and the two thresholds are used to process the
independent data to obtain a set of the observed weather
element versus the forecasted element results in contingency
table format. These tables and their calculated threat
scores are presented in Chapter V and Appendix I.
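
The last step of the single-stage procedure (thresholding the regression output and scoring the result) can be sketched as below. The formal score definitions are in Appendix E, which is not reproduced here; the sketch assumes AO is percentage correct and that the threat score for a category is hits divided by (forecasts + observations - hits). All names and sample values are hypothetical.

```python
def categorize(value, t12, t23):
    """Map a regression output to category 1, 2 or 3 using two thresholds."""
    if value <= t12:
        return 1
    if value <= t23:
        return 2
    return 3


def contingency_table(forecast, observed):
    """3 x 3 table with observed category as row and forecast as column."""
    table = [[0] * 3 for _ in range(3)]
    for f, o in zip(forecast, observed):
        table[o - 1][f - 1] += 1
    return table


def skill_scores(table):
    """Return (AO in percent, [TS1, TS2, TS3]) under the assumed definitions."""
    total = sum(sum(row) for row in table)
    ao = 100.0 * sum(table[k][k] for k in range(3)) / total
    threat = []
    for k in range(3):
        hits = table[k][k]
        n_observed = sum(table[k])
        n_forecast = sum(row[k] for row in table)
        denom = n_observed + n_forecast - hits
        threat.append(hits / denom if denom else 0.0)
    return ao, threat


# Hypothetical regression outputs and thresholds, for illustration only.
outputs = [1.2, 1.9, 2.6, 2.1]
observed = [1, 2, 3, 3]
forecast = [categorize(v, t12=1.5, t23=2.5) for v in outputs]
ao, threat = skill_scores(contingency_table(forecast, observed))
```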

b.  Two-Stage Regression

This model, referred to in the tables as BMD TS,
is based on a decision-tree scheme using two linear
regression equations trained on the dependent data. The
first equation is generated by separating the largest frequency
category from the other two. In the cases of cloud amount
and ceiling this was accomplished by setting the values for
category I and II to 1 and the values for category III to
2 and then developing a regression equation and threshold
(as in the single stage above) to suitably describe the two
distributions. The second stage regression equation and
threshold are generated based only on those observations
which did not exceed the first stage threshold value,
effectively eliminating cases evaluated by the first stage as
being category III. The second stage is thereby a separation
of category I from category II observations. In other words,
the first stage regression separates category III from the
combined grouping of categories I and II, while the second
stage separates the remaining category II from category I
data. The two resulting equations and associated thresholds
are then applied to the independent data to obtain the
forecast versus observed contingency tables and calculate
the threat scores.
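
The decision-tree logic of the two-stage scheme can be sketched as follows. The coefficients, thresholds and function names are hypothetical; only the order of the two decisions follows the description above.

```python
def linear_equation(intercept, coefficients):
    """Build a regression function y = intercept + sum(c_i * x_i)."""
    def evaluate(predictors):
        return intercept + sum(c * x for c, x in zip(coefficients, predictors))
    return evaluate


def two_stage_forecast(predictors, stage1, threshold1, stage2, threshold2):
    """Stage 1 splits category III from I and II; stage 2 splits II from I."""
    if stage1(predictors) > threshold1:
        return 3
    if stage2(predictors) > threshold2:
        return 2
    return 1


# Hypothetical equations and thresholds, for illustration only.
stage1_eq = linear_equation(1.0, [0.5, 0.1])
stage2_eq = linear_equation(1.0, [0.2, 0.4])
example_forecasts = [
    two_stage_forecast(case, stage1_eq, 1.5, stage2_eq, 1.5)
    for case in ([2.0, 1.0], [0.5, 2.0], [0.2, 0.3])
]
```

Only cases that fail the first-stage test reach the second equation, mirroring the way the second stage is trained on the observations that did not exceed the first threshold.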

c.  Threshold Models

The equal variance model (referred to as EVAR)
uses an algorithm which requires the assumption that the
variances of the two normally distributed populations which
are to be separated by a threshold are equal, while their
means are unequal. The quadratic threshold (referred to as
QUAD) algorithm makes no assumptions about the means and
variances, but does take into consideration group a priori
probability. The maximum likelihood decision criteria
(MLDC) was modified for use as a third threshold model in
order to separate the categories of scattered and broken
clouds, categories I and II (Cooley, 1978), historically a
difficult task. The MLDC is not based on a priori group
probabilities but requires only the event conditional
probability functions of the observations, and is useful in
predicting events of rare occurrence. In the study, the MLDC
threshold model consists of using the midpoint between the
category I and II distribution means with the EVAR
threshold between the category II and III distributions.
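
As a rough illustration of the three threshold ideas, the sketch below uses textbook Gaussian discriminant forms: the point where the prior-weighted normal densities of two groups are equal. The exact algorithms used in the study are in Appendix A and may differ; the function names, the pooled standard deviation and the prior probabilities are all assumptions, while the two means are the area 2, TAU-00 values quoted in Chapter V.

```python
import math


def evar_threshold(mu1, mu2, sigma, p1, p2):
    """Equal-variance threshold: prior-weighted densities are equal."""
    return 0.5 * (mu1 + mu2) + sigma ** 2 * math.log(p1 / p2) / (mu2 - mu1)


def quad_threshold(mu1, s1, p1, mu2, s2, p2):
    """Unequal-variance threshold(s): roots of a*x^2 + b*x + c = 0."""
    a = 1.0 / (2 * s2 ** 2) - 1.0 / (2 * s1 ** 2)
    b = mu1 / s1 ** 2 - mu2 / s2 ** 2
    c = (mu2 ** 2 / (2 * s2 ** 2) - mu1 ** 2 / (2 * s1 ** 2)
         + math.log((p1 * s2) / (p2 * s1)))
    if abs(a) < 1e-12:          # equal variances: the equation is linear
        return [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:                # the weighted densities never cross
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


def midpoint_threshold(mu1, mu2):
    """Midpoint between two group means (no priors involved)."""
    return 0.5 * (mu1 + mu2)


# Means from the text; sigma and priors are hypothetical.
t_evar = evar_threshold(2.033, 2.139, sigma=0.25, p1=0.2, p2=0.4)
t_mid = midpoint_threshold(2.033, 2.139)
```

With a less frequent category I, the prior term pushes the equal-variance threshold to the left of the category I mean, whereas the midpoint ignores the priors entirely.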

D.  SIGNIFICANCE  TESTING 

The results of the experiments are tested against two
standards to demonstrate that the results are significant
with respect to chance and to evaluate improvement over
classical MOS modeling methods.

1.  Significance of the Skill of a Forecast versus Chance

This first test of the results is based on the
proposal that both percentage correct (AO) and threat scores
(TS1, TS2, TS3) can be presented as probabilities and the
fact that a binomial population, if large enough, can be
approximated by a normal distribution. As such the percentage
correct and threat scores may be subjected to a null
hypothesis significance test derived as a consequence of the
central limit theorem. The actual significance testing is made
with respect to confidence intervals about the scores which
would be achieved by a uniform random distribution of category
I, II and III observations in a 3 x 3 contingency table.
These scores represent the scores which would be achieved by
pure chance. The null hypothesis can be stated as
$$P(A < X < B) = 0.95$$

with lower limit

$$A = \frac{y}{n} - 1.96\left[\frac{(y/n)(1 - y/n)}{n}\right]^{1/2}$$

and the upper limit

$$B = \frac{y}{n} + 1.96\left[\frac{(y/n)(1 - y/n)}{n}\right]^{1/2}.$$

A 95% confidence interval is made about the scores expected
if each category had an equally likely chance of being
forecasted. If the procedure score lies within the interval
for the null hypothesis score then it is considered that
there is no statistically significant difference between the
two scores. The contingency tables and 95% confidence
interval calculations are shown in Figs. 3, 13, 20, 27 and 37.
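
The interval calculation can be sketched numerically as follows. The symbols y and n are not defined in this section; the sketch assumes y is the count contributing to the score and n the total number of cases, so that y/n is the score expressed as a proportion. Function names are hypothetical.

```python
import math


def binomial_interval(y, n, z=1.96):
    """Normal-approximation interval about the proportion y/n."""
    p = y / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


def differs_from_null(score, interval):
    """True when the score lies outside the null-hypothesis interval."""
    lower, upper = interval
    return not (lower <= score <= upper)


# Illustrative numbers only: 50 of 100 cases.
null_interval = binomial_interval(50, 100)
```

A score inside the interval is treated as not statistically different from the null-hypothesis score, as described above.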

2.  Improvement Over Baseline

Since some form of regression is the traditional
method of developing MOS models, the baseline standard for
comparison of all the experiments in this study is the set of
confidence intervals generated using the results from the
single-stage regression for each area and time period. These 95%
confidence intervals are made using the same equations as
the test for significance of a score versus chance.
Additionally, in area 2, the TAU-00 baseline is used to evaluate
degradation of the results with time. The baseline intervals
are shown in Figs. 10, 17, 24, 30 and 37.
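
A comparison against a baseline or chance interval reduces to a three-way test. The helper below is a hypothetical sketch; the numbers in the example are the area 2, TAU-00 values reported in Chapter V.

```python
def compare_to_interval(score, lower, upper):
    """Classify a score as below, within, or above a confidence interval."""
    if score < lower:
        return "below"
    if score > upper:
        return "above"
    return "within"


# Area 2, TAU-00 baseline AO interval: 45.40 to 53.42.
natr_vs_baseline = compare_to_interval(53.43, 45.40, 53.42)
maxprob_vs_baseline = compare_to_interval(51.26, 45.40, 53.42)
# EVAR TS1 of .09 against the chance interval .12 to .18.
evar_ts1_vs_chance = compare_to_interval(0.09, 0.12, 0.18)
```

Only a score classified as "above" is counted as a significant improvement; "within" means no significant difference.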

E.  MEASURES  OF  SEPARABILITY 

As the testing proceeded through progressive time stages,
it became more apparent that the methods were struggling to
separate the categories of scattered and broken clouds,
categories I and II (Cooley, 1978). This problem required
investigation of some alternate predictor selection schemes
to improve the ability to discriminate between these
categories. Two approaches of determining the optimal separation
between the categories were combined and then applied in
a brief analysis on area 2, TAU-00. These methods are
termed Class Separability Measures and Cluster Analysis.
Unfortunately, time did not permit a detailed attempt at
using these two methods, but the results from area 2, TAU-00
are included in this study and show sufficient potential to
deserve further study.

1.  Class Separability Measures

The specific separability measures used were the
Bhattacharyya distance, the divergence, and the Mahalanobis
distance (Hand, 1981), each of which is discussed in more
detail in Appendix B. These measures were calculated using
the means and variances, and in the case of the Mahalanobis
distance, pooled variances of the various predictors with
the following univariate form:

[Equations for the Bhattacharyya distance (Bh), the
divergence (Div) and the Mahalanobis distance are not
legible in the source text.]

The predictors were separated by class, in this case
categories I, II and III, and then the three measures were
calculated for category I versus II, category II versus
III, and category I versus III. It is important to note
that the variables that best discriminate between groups I and
II may not be the same as those that best discriminate
between II and III or between I and III. In each case the
means and variances of the predictors were scaled from 0 to
100 to ease number handling and value comparisons. The
calculated distance measures are listed in Tables X, XI and
XII for area 2 at TAU-00.
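
For orientation, the three measures can be computed for a single predictor from two class means and variances. The forms below are the common textbook expressions for two univariate normal classes; they are an assumption and may not match the exact forms given in Appendix B (for example, some texts take the square root of the Mahalanobis quantity). Function names are hypothetical.

```python
import math


def bhattacharyya(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate normal classes."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def divergence(mu1, var1, mu2, var2):
    """Symmetric (Kullback) divergence between two univariate normals."""
    var_term = 0.5 * (var1 / var2 + var2 / var1 - 2.0)
    mean_term = 0.5 * (mu1 - mu2) ** 2 * (1.0 / var1 + 1.0 / var2)
    return var_term + mean_term


def pooled_variance(var1, n1, var2, n2):
    """Sample-size weighted pooled variance of two classes."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def mahalanobis(mu1, mu2, pooled_var):
    """Squared Mahalanobis distance using a pooled variance."""
    return (mu1 - mu2) ** 2 / pooled_var


# Illustrative values only: unit variances, means two units apart.
bh = bhattacharyya(0.0, 1.0, 2.0, 1.0)
div = divergence(0.0, 1.0, 2.0, 1.0)
mah = mahalanobis(0.0, 2.0, pooled_variance(1.0, 10, 1.0, 20))
```

Larger values indicate better separated classes, so each measure would be evaluated per predictor for every pair of categories.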

2.  Cluster Analysis

Cluster Analysis takes a sample of potential
predictor variables of unknown classification and groups those
variables into natural classes or "clusters." The method is
fundamentally a tool for data exploration to determine if
natural and useful groupings do, in fact, exist. This method
was applied to the predictors by use of the BMDP Statistical
Program, P1M, which provides four measures of similarity for
clustering variables and three criteria for linking or
combining clusters. A more detailed discussion of cluster
analysis is found in Appendix C. In general,
clustering was used to determine groupings of predictors that carry
much the same information in relation to the predictand
classes.
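
The idea of grouping predictors that carry much the same information can be sketched with a small agglomerative clustering of variables. This does not reproduce the BMDP P1M options; it assumes absolute correlation as the similarity measure and single linkage as the combining rule, and the variable names and cutoff are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_variables(columns, min_similarity=0.8):
    """Merge the most similar clusters until none reach min_similarity."""
    clusters = [[name] for name in columns]

    def similarity(c1, c2):
        # Single linkage: the most similar pair of members decides.
        return max(abs(correlation(columns[a], columns[b]))
                   for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = similarity(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best[0] < min_similarity:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical predictor samples, for illustration only.
predictors = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
groups = cluster_variables(predictors)
```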

3.  Experiments Using Separability and Clustering

It must be noted that time did not permit an
extensive investigation of these methods, but rather only a
cursory look at their potential for usefulness. The basic
method consists of using the cluster analysis to develop groups
of variables to choose from, and then employing the separation
measures to select the "best" predictor from each of these
clusters. These parameters are then used to develop a linear
regression equation to predict the three cloud amount
categories in the same manner as earlier testing in this study.

Four  experiments  were  attempted: 

a.  A single-stage regression using the variables which
had relatively high separability measure values
for category I versus II.

b.  A single-stage regression using the variables
which had relatively high separability measure
values for category II versus III.

c.  A single-stage regression using the predictors with
the highest separation value from each clustered
group of predictors.

d.  A two-stage regression using separation and
clustering to separate category I from II and III
and then category II from III.
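
The selection step shared by these experiments, taking the predictor with the highest separability value from each cluster, can be sketched as below. The predictor names, cluster memberships and separability values are hypothetical.

```python
def best_per_cluster(clusters, separability):
    """Return the predictor with the highest separability in each cluster."""
    return [max(cluster, key=lambda name: separability[name])
            for cluster in clusters]


# Hypothetical clusters and separability values, for illustration only.
example_clusters = [["p1", "p2", "p3"], ["p4"], ["p5", "p6"]]
example_separability = {
    "p1": 0.12, "p2": 0.31, "p3": 0.08,
    "p4": 0.05, "p5": 0.22, "p6": 0.40,
}
selected = best_per_cluster(example_clusters, example_separability)
```

The selected predictors would then be passed to the regression program in place of the full predictor list.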

GENERAL

The first area studied was cloud amount in area 4,
TAU-00, and the procedure is an exact application of the
methodology used in the previous MOS studies for visibility
(Karl, 1984; Diunizio, 1984). The one exception is that the
linear regression model is tested with both a single-stage
and two-stage regression technique. Next, area 2 of the
its TS2 falls below the baseline significance. Its maximum
AO of 49.49 is not significantly different from the baseline and
is attained at the first predictor level. By the fourth
predictor it has reached a TS1 that is significant with respect
to both chance and the baseline, but only with severe degradation
to both its AO and TS3 scores. NATR does very poorly overall,
attaining its maximum AO (45.62) at four predictors, which
is within the baseline interval, but only marginally within the
TAU-00 baseline confidence interval. Unlike MAXPROB I and
MAXPROB II, NATR is not able to predict category I with
any acceptable credibility, even after four predictors.
It, too, retained TS2 and TS3 values that are not significantly
different from the TAU-00 baseline.

On the other hand, the PR+BMD model (Fig. 26 a-c)
produced very different results. It selected a grouping
size of six equally populous intervals, reaching its peak
AO for MAXPROB I and II at the second predictor. While
slightly lower than the AO for PR, the identical results of
MAXPROB I and II show some skill at forecasting category
I. MAXPROB I shows a TS1 of .13 and MAXPROB II shows a .17,
both of which are significant improvements over the baseline,
and the MAXPROB II value is marginally significant with
respect to chance. Although still lagging behind in AO by
nearly 3%, NATR also shows significant improvement over
baseline in TS1 but not enough to be considered significant
with respect to chance.

Baseline Interval
AO :  42.97 to 50.97
TS1:  .00 to .02
TS2:  .32 to .39
TS3:  .36 to .44

The MLDC moves the threshold to 1.91, resulting in a decrease
in AO of nearly 5%, falling outside the baseline confidence
interval. Although the TS1 was raised to .17, it is not
significant with respect to chance, and the TS2 suffered
severe degradation such that it is no longer significant
with respect to chance either.

The PR model (Fig. 25 a-c) selected eight for a
grouping size, which limited the model to only four predictors
due to a 2 megabyte limitation at the NPS computer center
(this is addressed in Chapter IV of this paper and in Diunizio,
1984). All three strategies in this time period suffer the
same inability to forecast category I cloud amount. It is
not until the fourth predictor that any of the schemes, namely
NATR, attains higher than a .04 TS1. MAXPROB I attains its
relatively high AO peak (50.31) at the second predictor, but
is unable to forecast any category I at this level. By the
fourth predictor it attains a TS1 of .13 while dropping its
AO to 47.25 and its TS2 (.27) below baseline significance.
MAXPROB  II  strongly  overpredicts  the  category  III  overcast 
situation,  which  gives  it  a  TS3  value  of  .45.  While  this 
is  a  statistically  significant  improvement  over  the  baseline. 
The BMD single-stage regression (Figs. 21-23) starts
to show signs of deterioration with time as one would expect.
Again, EVAR has the highest AO at 46.97, which is significant
with respect to chance and is not significantly different
from the TAU-00 baseline, but it is near the lower limit of
that baseline confidence interval. Most of the degradation
takes place in the TS1 category, which is not doing well at
TAU-00, but is doing even worse at TAU-24. BMD EVAR and
QUAD are almost unable to distinguish any category I
observations from category II (TS1 of .01). At the same time
both TS2 and TS3 remain within the confidence interval for
the TAU-00 baseline, showing no significant difference. The
BMD equation yields an even smaller separation between the
means of category I and II than was seen in TAU-00. In this
case, the mean for the clear/scattered case is 2.068 with a
standard deviation of .232, and for the broken group, the
mean is .214 with a standard deviation of .223. The obvious
problem here is that the separation between the means is
less than one-third of a standard deviation! This
is a tough problem for any threshold model. EVAR and QUAD
produced thresholds of 1.679 and 1.614 respectively, both
well left of the mean of the scattered cloud group,
accounting for the almost zero forecasting of category I cloud
amount. The BMD EVAR results lead to the following baseline
intervals (Fig. 24):
NATR on the other hand attained the highest AO yet received
by any method or area at 53.43%. This fell .01% above the
baseline interval and therefore could be considered to be
a marginally significant improvement over the baseline BMD
model. NATR also showed significant improvement over
baseline in TS2 (compare .42 with the upper limit of .39).
Although NATR remained within the interval of significance
for the baseline in TS1, it still fell short of statistical
significance with respect to chance.

It  is  clear  by  all  measures  that  the  PR+BMD  method, 
specifically  the  MAXPROB  I  strategy,  achieved  the  best 
results.  It  is  significant  that  none  of  the  methods  could 
forecast  category  I  cloud  amounts  with  a  skill  level  better 
than  pure  chance. 

2.  Area  2,  TAU-24  (Table  VII) 

The TAU-24 time period has an extra five Model Output
Parameters (MOP's) added to the available predictors. All
other MOP's and derived parameters remained the same (see
Appendix D).

The following are the confidence intervals for
significance with respect to chance (Fig. 20):

Significance Test
AO :  29.43 to 37.35
TS1:  .11 to .17
TS2:  .18 to .25
TS3:  .20 to .27
but also attaining better scores than baseline in all four
scores. This improvement over baseline, however, is not
enough to be above the confidence interval for improvement,
although nearly so in both TS1 and TS2. NATR did not attain
peak AO until the third predictor (48.07), and this was still 3%
lower than MAXPROB I or II. NATR displayed an actual
significant improvement over the baseline interval in the TS2
score (.42) but fell below baseline significance in TS3.

When allowed to progress to five predictors, not only did
NATR continue to improve at the next step, but also each of
the three schemes improved in TS1 (MAXPROB II scored a
TS1 = .20 at the fourth predictor) while degrading TS2,
TS3 and AO. This might be significant at some time if TS1
were decided to be the most important category to forecast.

The PR+BMD, Fig. 19 a-c, selected a grouping size
of six, and attained peak AO for MAXPROB I at four predictors
and for MAXPROB II and NATR at three predictors. MAXPROB
I attained the same AO as it did in the PR model but improved
its TS1 and TS3 scores. Although TS1 did not improve to
significance with respect to chance, it did improve
significantly over the baseline (compare .17 to .11). TS2 was
nearly equal to baseline, but TS3 was at the upper limit
of the baseline confidence interval for improvement.

MAXPROB II did not fare as well as MAXPROB I overall but TS2
and TS3 did remain within the confidence interval of the
baseline, showing no significant difference or improvement.


has a mean of 2.139 with a standard deviation of .246. With
the means of the two categories being separated by less
than one-half of a standard deviation, it is easy to see
why the TS1 is so low. Both the EVAR and QUAD models place
the threshold value separating category I from II well to
the left of the mean of category I (EVAR threshold = 1.705
and QUAD threshold = 1.643). This situation holds throughout
the area 2 testing of the BMD model. The QUAD model shows
only slight variation from EVAR, which is not surprising
since the thresholds are very nearly the same. The MLDC
model moves the threshold between category I and II to 1.864.
This significantly raises the TS1 above the testing
confidence interval, but loses 2% on AO and nearly reduces TS2
below significance levels. The resulting baseline confidence
intervals from Fig. 17 are:

Baseline Interval
AO :  45.40 to 53.42
TS1:  .07 to .11
TS2:  .32 to .39
TS3:  .36 to .44

The PR model, Figs. 18 a-c, selected a grouping size
of six equally populous intervals and achieved its peak AO
at the second predictor level with MAXPROB I and MAXPROB II
(51.26). Both schemes have the same results at this stage,
not only showing significant results in AO, TS2 and TS3


improvements accomplished by the other methods in that time
period. The TAU-00 baseline is used to compare all three
time frames so that a trend with time can be evaluated as
well.

1.  Area 2, TAU-00 (Table VI)

The significance test with respect to chance is
calculated in Fig. 13 and yields the following intervals:

Significance Intervals
AO :  29.52 to 37.08
TS1:  .12 to .18
TS2:  .18 to .25
TS3:  .19 to .26

The first model evaluated is the BMD single-stage
regression using the EVAR threshold, the baseline for
evaluating other models (Fig. 14). It produced an AO that
is significantly better than chance (49.41% compared to
37.08%) and very significant values for TS2 and TS3. In
fact, these TS values exceed those of the EVAR model in area 4,
TAU-00. However, the price is paid in the TS1 value. The
EVAR model was able to obtain only a .09 threat score for
the clear/scattered category, obviously well below the
significance test for chance. This is the result of the BMD
equations being unable to clearly separate the
clear/scattered category from the broken category. The histograms
show that the equations result in category I having a mean
of 2.033 and a standard deviation of .257, while category II
significantly different from (although lower than) the baseline
interval. Both MAXPROB I and II improved over their PR
counterparts in every score except TS3.

An interesting difference between the Preisendorfer
models and the linear regression models that holds
throughout the study is in the response of the dependent
scores. In the BMD models the dependent data (training
set) scores are very near to those of the testing (independent)
scores, whereas in the Preisendorfer schemes the dependent
AO scores typically rise to values above 90% with the
addition of the fifth predictor. This may indicate that the
PR models do an excellent job of fitting the training sample
but do not make proper inference concerning the structure of
the population from which the sample was drawn.

B.  NORTH  ATLANTIC  OCEAN  AREA  2  CLOUD  AMOUNT 

Area  2  (Fig.  2)  encompasses  a  geographic  region  that 
extends  from  the  southeastern  tip  of  Newfoundland,  across 
the  North  Atlantic  Ocean  to  the  eastern  coast  of  England, 
north  to  the  Five  Fingers  of  Iceland  and  back  to  the  Canadian 
coast north of Newfoundland. Area 2 was studied through
all three time periods, TAU-00, TAU-24 and TAU-48, and each
will be discussed separately. As in area 4, a null
hypothesis is generated for each time period to evaluate the
significance of the results versus chance. Also, as in area 4,
a set of confidence intervals based on the BMD SS model for
each time period is used as the baseline for measuring
The PR model results are shown in Figs. 11-12. The
model selected a grouping size of six and the peak results
were attained by all three strategies at the five predictor
level. Natural regression (NATR) produced the highest AO
(45.36) for this model, followed by MAXPROB I (44.14) and
MAXPROB II (41.14). Though all of the strategies had
significant AO values compared to chance, none of them improved
on the AO of the baseline. MAXPROB I improved on BMD for
TS3, but lagged in other scores. The scores all lie in or
below the confidence interval for the baseline and, therefore,
cannot be considered to be significantly different. MAXPROB
II did appreciably worse in that it showed significance
with respect to chance but its AO and TS2 were below the
baseline interval. NATR, with the best AO of the three PR
strategies, lost skill in category I, as indicated by TS1,
and this is not even significant with respect to chance. Its
TS2 and TS3 were not significantly different from the
baseline values. It is of interest to note that the PR scheme
and the BMD single-stage model did not select any common
predictors.

The PR+BMD scheme selected a grouping size of six and
attained peak AO values for MAXPROB I and II at two variables
and for NATR at three variables. In this case both
MAXPROB I and II produce nearly equal results, with MAXPROB
II showing slightly higher AO, TS1 and TS2. In all three
cases the AO was significant compared to chance but not
again, by moving the thresholds with the MLDC model, a
decrease in AO is observed (43.67%) with significant
increases in TS1 and TS3. This increase in TS1 and TS3 is
also at the expense of decreasing TS2 below significance
levels.

Because the results of the single-stage regression
were so much better than the two-stage, and in view of the
fact that all the homogeneous areas of the North Atlantic
Ocean display similar distributions, the single-stage model
was pursued for the remainder of the cloud amount
experiments. A single exception will be discussed later. In
Chapter IV it was mentioned that because linear regression,
of some form, is the traditional method for MOS studies and
operational models, it would be selected as a baseline
measurement (in addition to the confidence interval generated
by the null hypothesis contingency table) to measure the
skill of the other methods. The single-stage BMD with the
EVAR threshold model was selected as this "baseline" measure.
Fig. 10 shows the development of the confidence intervals
for area 4 baseline. The resulting intervals are:

Baseline Intervals
AO :  43.21 to 49.19
TS1:  .25 to .31
TS2:  .30 to .36
TS3:  .24 to .30

However, the threat scores for category I and III are below
the confidence interval for pure chance. The poor results
are most likely a reflection of the particular nature of
the frequency of occurrence in each of the observation
categories. The two-stage regression was chosen in the
visibility studies because of a very low occurrence (most
cases less than 5% of total observations) of low visibility.
Since low visibility was the threat most desired to predict,
the two-stage regression was chosen to more skillfully
predict a low frequency category. In area 4, on the other hand,
the frequency of observation is nearly the same for all three
categories of cloud amount. The thresholds of the two stages
were moved closer to the middle in the MLDC model (Fig. 6)
in order to better predict the outside two categories, I and
III. The resulting AO is 2% lower than the EVAR or QUAD
models, and an increase in threat scores for both categories
I and III occurred. Only the new threat score for category
I (.33) increased beyond the significance level, but a large
price was paid in the TS2, which dropped to .27, close to the
significance-level boundary.

Since  the  frequencies  of  occurrence  for  the  three 
categories  are  nearly  equal,  a  single-stage  regression  model 
was  next  attempted  (Figs.  7-9).  The  EVAR  threshold  model 
demonstrates  only  a  1.0%  increase  in  AO,  but  much  more 
importantly,  all  three  categories  have  threat  scores  signi¬ 
ficantly  above  chance.  The  QUAD  model  has  a  slightly  higher 
AO  (46.67%)  than  EVAR  and  very  similar  threat  scores.  Once 


attained  by  that  particular  strategy,  while  the  dependent 
scores  reflect  the  results  attained  using  the  first  five 
predictors  selected. 


A.  NORTH  ATLANTIC  OCEAN  AREA  4  CLOUD  AMOUNT 

Area  4  was  selected  as  the  first  for  evaluation  because 
of  its  large  sample  size  and  nearly  equally  populous  obser¬ 
vation  categories  I,  II  and  III.  This  area  encompasses  a 
broad  region  of  the  North  Atlantic  Ocean  with  the  southern 
border  reaching  to  the  northeastern  tip  of  Portugal  and 
extending  northward  through  the  English  Channel  to  encompass 
the  southern  portion  of  the  North  Sea  (Fig.  2) . 

1 .  Area  4,  TAU-00  (Table  V) 

The  following  are  the  confidence  intervals  for 
significance  with  respect  to  chance  (Fig.  3): 

Significance Intervals
AO : 30.53 to 36.19
TS1: .17 to .22
TS2: .20 to .25
TS3: .15 to .20

The first model tested is the two-stage BMD, applied in the
same manner as in the previous NPS visibility studies (Karl,
1984; Diunizio, 1984). The results, shown in Figs. 4 and
5 (EVAR and QUAD), show an AO of 45.27%, which is significant
with respect to chance, and a category II threat score (TS2)
of .40, which is also highly significant compared to chance.
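The AO and threat scores quoted above are read from a three-category contingency table. The formal definitions are in Karl (1984) and are not reproduced here, so the following is only a minimal sketch under two assumptions: AO is the percentage of cases on the table diagonal, and the threat score for a category is hits / (hits + misses + false alarms). The counts in `example_table` are hypothetical and are not taken from any figure in this report.

```python
def contingency_scores(table):
    """Return (AO in percent, list of threat scores) for a square table.

    table[i][j] = number of cases observed in category i and forecast
    as category j (an assumed layout, not taken from the report).
    """
    n_cat = len(table)
    total = sum(sum(row) for row in table)
    hits = [table[k][k] for k in range(n_cat)]
    ao = 100.0 * sum(hits) / total

    threat = []
    for k in range(n_cat):
        observed_k = sum(table[k])                          # hits + misses
        forecast_k = sum(table[i][k] for i in range(n_cat))  # hits + false alarms
        denom = observed_k + forecast_k - hits[k]
        threat.append(hits[k] / denom if denom else 0.0)
    return ao, threat


# Hypothetical counts for categories I, II and III.
example_table = [
    [50, 30, 20],
    [25, 60, 15],
    [20, 30, 50],
]
ao, ts = contingency_scores(example_table)
print(f"AO = {ao:.2f}%  TS1 = {ts[0]:.2f}  TS2 = {ts[1]:.2f}  TS3 = {ts[2]:.2f}")
```

A score computed this way can then be compared against the chance and baseline intervals listed in the text.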


V.  RESULTS 


The  procedures  for  the  experimentations  on  predicting 
cloud  amount  and  ceiling,  as  specified  in  Chapter  IV,  were 
followed  for  the  North  Atlantic  Ocean  homogeneous  areas  2 
and  4.  These  homogeneous  areas  are  displayed  in  Fig.  2. 

The  results  of  these  procedures  are  summarized  in  Tables  V 
through  IX,  and  detailed  results  are  displayed  in  Figs.  3 
to  44.  This  chapter  discusses  the  results  and  significance 
of  each  area  and  each  model  run  using  the  information  on 
these  figures.  Cloud  amount  is  pursued  first  in  the  study 
since  it  is  important  to  the  prediction  of  ceilings,  as  noted 
in  Chapter  III. 

The  terms  used  throughout  this  section  are  defined  in 
Chapter IV. The linear regression models are referred to
as BMD and the three threshold models are Equal Variance
(EVAR), Quadratic (QUAD) and Maximum Likelihood Decision
Criteria (MLDC). The Preisendorfer method is used both with
(PR+BMD) and without (PR) linear regression equation predictors.
In each model AO (total percent correct) is used as
the  criterion  for  the  "best"  model.  In  the  PR  and  PR+BMD 
models,  a  contingency  table  is  generated  for  all  three 
strategies,  MAXPROB  I,  MAXPROB  II  and  natural  regression, 
with  the  addition  of  each  new  predictor.  In  all  cases, 
the independent score discussed reflects the best score
North Atlantic Ocean was tested at all three time increments,
TAU-00,  TAU-24  and  TAU-48,  in  the  same  manner,  but  without 
the  two-stage  regression.  Finally  area  2,  TAU-00,  was  tested 
using  the  measures  of  separability  and  clustering  techniques. 

Testing  on  ceiling  height  prediction  was  limited  to 
area  2,  TAU-00,  using  initially  the  same  methodology.  An 
experiment  was  then  made  to  test  the  ability  to  forecast 
ceilings  given  perfect  skill  at  predicting  cloud  amounts. 

In  this  case  the  categorized  cloud  amount  was  used  as  a 
predictor  in  the  ceiling  prediction  methodologies.  The 
results of each of these tests are discussed in the next
chapter, and are summarized in Tables V through IX and Figs.
4 through 44.


In general, it can be observed that an expected
degradation is experienced from TAU-00 to TAU-24 in all of
the  methodologies.  The  inability  to  forecast  category  I  is 
the  most  glaring  problem.  At  TAU-00  the  skill  levels  are 
poor  in  forecasting  category  I,  but  in  TAU-24  they  become 
nearly  zero  in  all  but  the  PR+BMD  method.  In  no  case  are 
any  of  the  methods  able  to  attain  significant  skill  in  fore¬ 
casting  scattered  clouds  in  comparison  to  pure  chance. 
However,  it  would  be  fair  to  observe  that  the  AO  degradations 
in  general  are  not  as  significant  as  one  might  have  expected. 

3 .  Area  2,  TAU-48  (Table  VIII) 

The area 2 TAU-48 time period also has the five
extra predictors mentioned above in TAU-24. The following
are the confidence intervals for significance of the skill
scores with respect to chance for TAU-48, area 2 (see Fig.
27):


Significance Test
AO : 29.30 to 37.11
TS1: .12 to .18
TS2: .18 to .25
TS3: .20 to .26

As  one  would  expect,  the  models  continue  to  experience 
a  degradation  with  time.  The  BMD  EVAR  model  attains  only 
an  AO  of  45.32  which,  although  well  above  the  significance 
test for chance, still falls below the TAU-00 baseline confidence
level, indicating that it is significantly worse and
that considerable degradation has occurred. Additionally,
the  threat  scores  for  category  II  and  III  are  only  marginally 
within  the  TAU-00  baseline  confidence.  Interestingly, 
though  only  by  a  small  amount  (TS1  =  .04) ,  the  BMD  TAU-48 
is  able  to  forecast  category  I  better  than  TAU-24.  The 
overall  degradation  in  performance  is  clearly  seen  in  dis¬ 
tributions  of  the  three  categories  by  the  BMD  equation. 

Between  the  category  means  for  I  and  II  there  is  now  only  a 
separation  of  .05  while  the  standard  deviations  are  of  the 
order  of  .18.  This  same  degradation  is  seen  in  the  separation 
of  categories  II  and  III,  where  the  means  are  now  2.192  and 
2.270 and the standard deviations are .178 and .189, respectively.
By TAU-48 the separation of the means has shrunk to
the point that the QUAD model is unable to produce
a non-imaginary threshold between categories I and II. MLDC
also performs consistently with previous time periods, this
time reducing the AO to 42.93, which is only 6% better than
the upper limit on chance. In fact, only the MLDC TS3
proves  to  be  significantly  better  than  the  pure  chance 
contingency  table.  These  results  lead  to  the  following 
TAU-48  baseline  interval  (Fig.  30) : 


Baseline Interval
AO : 41.31 to 49.22
TS1: .02 to .06
TS2: .28 to .36
TS3: .32 to .40
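The "non-imaginary threshold" problem reported for the QUAD model can be illustrated with a short sketch. It assumes Gaussian class conditional densities with unequal variances, as in the general (quadratic) threshold case of the appendix, and finds the values of z at which the prior-weighted densities are equal. The function name and the numerical inputs are hypothetical; the priors used by the thesis are not given in this chapter.

```python
import math


def quad_thresholds(mu0, sigma0, p0, mu1, sigma1, p1):
    """Real solutions z of p1*N(z; mu1, sigma1) = p0*N(z; mu0, sigma0).

    Taking logarithms of both sides gives a*z**2 + b*z + c = 0.
    An empty list means the roots are imaginary (no usable threshold).
    """
    a = 0.5 / sigma0**2 - 0.5 / sigma1**2
    b = mu1 / sigma1**2 - mu0 / sigma0**2
    c = (mu0**2 / (2.0 * sigma0**2)
         - mu1**2 / (2.0 * sigma1**2)
         + math.log((p1 * sigma0) / (p0 * sigma1)))

    if a == 0.0:               # equal variances: the equation is linear
        return [] if b == 0.0 else [-c / b]

    disc = b * b - 4.0 * a * c
    if disc < 0.0:             # imaginary roots
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


# Equal priors: two real thresholds bracket the narrower group.
real_case = quad_thresholds(mu0=0.0, sigma0=2.0, p0=0.5,
                            mu1=0.0, sigma1=1.0, p1=0.5)

# A small prior on the narrower group: no real threshold exists.
imaginary_case = quad_thresholds(mu0=0.0, sigma0=2.0, p0=0.9,
                                 mu1=0.0, sigma1=1.0, p1=0.1)
print(real_case, imaginary_case)
```

When the category means nearly coincide and the priors differ, the weighted density of one group can lie below the other everywhere, which is the situation in which the quadratic has no real root.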


The  PR  model  (Fig.  31  a-c)  also  experiences  the  same 
degradation  with  time.  The  model  selected  grouping  size 
seven  and  reached  peak  AO  at  three  predictors  for  MAXPROB  I, 
two  predictors  for  MAXPROB  II  and  four  predictors  for  NATR. 
MAXPROB I loses 3.5% in AO from TAU-24, and also drops below
the significance limit for baseline TS2. At the same time,
however, it improves on the baseline for TS1, although not enough
to be considered significant with respect to chance. MAXPROB
II fares somewhat worse in every category except TS2, where
it maintains a score within the baseline interval. When compared
for time degradation with the TAU-00 baseline, it is
only marginally within the baseline interval for AO and TS1
and just below for TS3. Likewise, NATR scores are within
the 48-h baseline interval, with the exception of TS1, but
are significantly worse than the TAU-00 baseline in every
category with the exception of TS1.

The PR+BMD selected a grouping size of six and reached
its peak AO at the first predictor level. The identical
scores of MAXPROB I and MAXPROB II show statistically significant
improvement over the TAU-48 baseline in both AO and
TS3. However, the TS1 scores of zero reveal its inability
at this time period to forecast category I. When the model
runs out to five predictors, where NATR peaks on AO, then
it can be seen that all three schemes forecast category I
with a TS1 equal to or in excess of .20. For example,
MAXPROB II at the fourth predictor has an AO of 45.94, but
is significant in all threat scores with respect to chance
(i.e., TS1 = .26, TS2 = .28, TS3 = .34). NATR again performs
well below the MAXPROB strategies, even though its AO,
TS2 and TS3 scores are within the baseline confidence interval.

In general, it can be said that all the schemes
suffered significant degradations due to the 48-hour time
period. Forecasting category I (scattered/clear) remains
a problem through all time periods, and is only forecastable
at large cost to the other threat scores and the total
percentage correct.

4. Area 2, TAU-00 Experiments in Clustering and
Separability (Table VII)

A  brief  description  of  the  separability  and  cluster 
methods  and  procedures  is  found  in  Chapter  IV,  and  a  more 
detailed  theoretical  description  is  found  in  Appendices  B 
and  C.  The  results  of  the  measures  of  separability  program 
are  listed  in  Tables  X-XII  and  the  clustering  of  variables 
is  listed  in  Appendix  C.  The  baseline  for  comparison  in 
these  examples  is  the  area  2,  TAU-00  BMD  using  the  EVAR 
threshold,  and  the  null  hypothesis  significance  confidence 
intervals used for TAU-00, area 2.

The  first  test  consisted  of  selecting  the  predictors 
from  the  category  I  versus  II  grouping  of  the  measures  of 
separability,  using  those  predictors  with  the  highest 
divergences. As Table X shows, the values for the three
measures were very low in this grouping, which is a possible
clue to the low skill attained by all the methods in category
I. Predictors were chosen that had a divergence of .08 or
higher; these predictors were then used in the BMD EVAR model.

The  results  of  the  test  are  shown  in  Fig.  33.  The  model 
attained  an  AO  of  45.39  which  is  significant  with  respect 
to  chance,  but  is  outside  the  low  end  of  the  confidence 
interval for the baseline. The TS2 is not significantly
different from baseline but TS3 (.34) fell just below the
lower limit of the baseline value. Most importantly, though,
the model did not predict any category I. Although this is
well below the baseline value, the baseline value is itself
significantly worse than chance.
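The screening step used in this test (keep only predictors whose divergence meets a cutoff) can be sketched as follows. The measures of separability actually used are defined in Appendix B, which is not reproduced here. As a stand-in, the sketch uses the standard symmetric (Kullback-Leibler) divergence between two univariate Gaussians, which is an assumption and may differ from the thesis's measure. The predictor names and divergence values passed to `screen_predictors` are hypothetical; the means and standard deviations in the first call are the category II and III values quoted for the TAU-48 BMD equation.

```python
def gaussian_divergence(mu_a, sd_a, mu_b, sd_b):
    """Symmetric KL divergence between N(mu_a, sd_a^2) and N(mu_b, sd_b^2)."""
    var_a, var_b = sd_a**2, sd_b**2
    # Contribution from unequal variances.
    spread_term = 0.5 * (var_a / var_b + var_b / var_a - 2.0)
    # Contribution from the separation of the means.
    shift_term = 0.5 * (mu_a - mu_b) ** 2 * (1.0 / var_a + 1.0 / var_b)
    return spread_term + shift_term


def screen_predictors(divergences, cutoff):
    """Names with divergence >= cutoff, largest divergence first."""
    ranked = sorted(divergences.items(), key=lambda item: -item[1])
    return [name for name, value in ranked if value >= cutoff]


# Category II versus III values quoted in the text for TAU-48.
j_ii_iii = gaussian_divergence(2.192, 0.178, 2.270, 0.189)

# Hypothetical predictor names and divergences, screened at .08.
kept = screen_predictors({"pred_a": 0.12, "pred_b": 0.05, "pred_c": 0.08},
                         cutoff=0.08)
print(round(j_ii_iii, 3), kept)
```

A small divergence indicates heavily overlapping distributions, which is consistent with the difficulty of separating the corresponding categories.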

The  second  test,  found  in  Fig.  34,  is  similar  to  the 
first,  with  the  exception  that  the  variables  were  selected 
from  the  category  II  versus  III  grouping  of  the  measures  of 
separability  (i.e.,  predictors  with  values  above  .35). 

These measures showed much higher values, which is consistent
with the results of the methods in area 2, TAU-00, where the
threat scores for category II and III are very much higher
(i.e., the models are able to separate II from III much
more easily). This time the AO improved to 48.91, nearly equaling
the baseline value. The model also equaled baseline performance
in threat scores TS2 and TS3. Once again, however,
the model is unable to forecast any category I observations.

The  third  test  tries  to  combine  the  clustering  infor¬ 
mation  with  the  measures  of  separability.  In  this  case  the 
clusters,  listed  in  Appendix  C,  were  used  as  the  initial 
sorting of predictors. Next, the predictor from each cluster
that has the highest divergence in the category I versus III
grouping was selected as a variable. These variables were
then used in the BMD EVAR and the results are shown in Fig.
35. This model did rather poorly and only stayed in the
confidence interval for the baseline in TS2. All other
scores dropped below the intervals for baseline while
remaining significant with respect to chance (except for
TS1).

The  fourth  test  attempted  to  utilize  all  the  informa¬ 
tion  available.  The  clustering  technique  was  combined  with 
the  measures  of  separability  to  select  variables  that  would 
best  separate  category  I  from  II,  and  then  those  that  would 
best  separate  category  II  from  III.  These  two  sets  of  pre¬ 
dictors  were  then  used  in  a  two-stage  regression  first 
separating  category  I  from  II+III  and  then  II  from  III. 

The  results,  shown  in  Fig.  36,  show  much  improvement  over 
the  previous  three  tests.  In  fact,  this  model  produced  the 
highest  AO  attained  by  any  of  the  BMD  models  so  far  studied. 
The  EVAR  threshold  produced  a  50.59  AO  which  is  higher  than 
baseline  but  not  significantly  so,  and  TS2  showed  modest 
improvement  over  the  baseline  interval.  This  model  also 
produced  a  smaller  TSl  (.05)  than  hoped  for,  but  the  fact 
that  it  is  greater  than  zero  is  encouraging. 
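The two-stage decision described in the fourth test can be sketched as a pair of threshold comparisons. This is a minimal illustration, assuming each stage yields a single regression value and that larger values indicate a higher category (the quoted BMD category means increase from I to III). The function name, argument names and numbers are hypothetical and do not come from Fig. 36.

```python
def two_stage_category(stage1_value, stage2_value, threshold_1, threshold_2):
    """Classify one case into category 1, 2 or 3.

    Stage 1 separates category I from II+III; stage 2 is consulted
    only for cases passed on by stage 1 and separates II from III.
    """
    if stage1_value < threshold_1:
        return 1
    return 2 if stage2_value < threshold_2 else 3


# Hypothetical regression values and thresholds.
cases = [(1.2, 2.5), (1.8, 2.1), (1.8, 2.4)]
forecasts = [two_stage_category(s1, s2, threshold_1=1.5, threshold_2=2.2)
             for s1, s2 in cases]
print(forecasts)
```

Because each stage has its own equation, each can draw on the predictor set that best separates its own pair of groups.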

It  is  unfortunate  that  more  time  was  not  available 
to  pursue  further  these  methods,  but  the  initial  testing  shows 
some  potential  for  usefulness  in  the  MOS  methods.  There  are 
several important points to be made. First, the results of
the measures of separability program confirm that the
regression methods used in area 2 are not forecasting category
I with much skill, because the available predictors do not
have enough information. Secondly, the clustering would prove
to be more valuable if the predictors were scaled in some way
to prevent all the velocity predictors being clustered together,
all the height predictors being clustered together, etc. (It is
possible that this type of result is not due to scaling but rather
to characteristics of the model producing the parameters.)

Thirdly,  the  measures  of  separability  give  high  values  to 
most  of  the  predictors  chosen  by  the  two  methods  generally 
used in this study. That lends plausibility to its usefulness
as  a  predictor  screening  agent  to  reduce  the  number  of  pre¬ 
dictors  being  forced  through  the  various  prediction  strategies. 

C.  NORTH  ATLANTIC  OCEAN  AREA  2  CEILINGS 

The  first  experiments  in  forecasting  ceiling  were  carried 
out  using  a  direct  application  of  the  methods  employed  for 
cloud amount and previously for visibility. The frequency
distributions of ceiling observations for the North Atlantic
Ocean are shown in Table IV. Area 2 was chosen for experimentation,
consistent with the concentration of MOS visibility
and cloud amount effort. The second set of experiments is
designed  to  evaluate  the  skill  of  forecasting  ceiling  given 
that  there  exists  perfect  skill  at  forecasting  cloud  amount. 

The  cloud  amount  observations  are  then  categorized  and  used 
as a parameter in the various methods. As in the studies on
cloud amount, a null hypothesis is established, using a
contingency  table,  based  on  each  category  having  an  equal 
probability  of  being  forecasted  for  each  observation.  This 
yields  the  following  95%  confidence  intervals  for  evaluating 
the  significance  of  the  results  (see  Fig.  37): 

Significance Intervals
AO : 29.23 to 36.77
TS1: .13 to .19
TS2: .20 to .27
TS3: .17 to .23
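An interval of this kind for AO can be approximated with a short calculation. The sketch below is not the thesis's procedure (that is developed in the figures); it assumes that, under the null hypothesis of equally likely forecast categories, each forecast is correct with probability 1/3, and applies the normal approximation to a binomial proportion. The sample size of 600 is hypothetical.

```python
import math


def chance_ao_interval(n_obs, n_categories=3, z_value=1.96):
    """Approximate 95% interval (in percent) for AO under pure chance."""
    p = 1.0 / n_categories                      # chance of a correct forecast
    half_width = z_value * math.sqrt(p * (1.0 - p) / n_obs)
    return 100.0 * (p - half_width), 100.0 * (p + half_width)


low, high = chance_ao_interval(n_obs=600)       # hypothetical sample size
print(f"AO chance interval: {low:.2f} to {high:.2f}")
```

The width of the interval shrinks with the square root of the sample size, which is why the larger area 4 sample gives a narrower interval than area 2.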


1.  Area  2,  TAU-00  Ceiling  Tests  Without  Cloud  Amount 
(Table  IX) 

The results of the BMD single-stage regression are shown
in Fig. 38. The resulting AO is significant with respect
to chance and is very similar to the values obtained in the
cloud amount studies. Threat scores for category I and II
are both well above significance with TS2 being the highest
at 0.42. However, the single-stage model is unable to discriminate
between category II and III. This is shown in
the TS3 score of .00 and in the histograms displaying the
distributions of the BMD equations. The means have good
separation between category I and II (1.666 and 1.861 respectively,
with standard deviations of .254 and .225). The
problem occurs between category II and III where the means
and standard deviations are 1.852 and 0.192 for category II
and 1.873 and 0.183 for category III. The mean separation
is nearly one-tenth that of the standard deviations. The
separation  is  so  small  that  the  QUAD  model  is  unable  to 
resolve  a  non-imaginary  threshold.  In  this  case  the  MLDC 
model  shows  promise.  By  moving  the  threshold  halfway  between 
II  and  III,  TS3  increased  from  .00  to  .15,  which  is  not  enough 
to  be  significant  compared  to  chance,  but  it  is  noteworthy 
that  the  AO  also  increased  by  1%  and  there  is  little  effect 
(-.03)  on  TS2.  As  in  the  cloud  amount  studies,  the  BMD 
single-stage  regression  will  be  used  as  the  baseline  for 
evaluating  other  methods.  In  this  case,  however,  the  BMD 
with  MLDC  threshold  will  be  used.  This  produces  the  following 
confidence  intervals  (see  Fig.  40): 

Baseline Interval
AO : 41.56 to 49.56
TS1: .19 to .25
TS2: .35 to .42
TS3: .12 to .18

The  PR+BMD  model  chose  a  grouping  size  of  six  and 
attains  peak  AO  for  the  MAXPROB  strategies  at  the  second 
predictor  and  for  NATR  at  the  third  predictor  (Fig.  41  a-c) . 
MAXPROB  I  achieves  the  highest  AO  (47.24)  of  the  three 
strategies  and  shows  very  different  results  in  the  threat 
scores  compared  to  the  baseline.  It  scores  well  above  the 
baseline interval for TS1 and TS3, while showing only
marginal improvement over chance in TS2. The most striking
fact is that the PR+BMD does so well in category III
(TS3 = .36) compared to the counterpart scores in the BMD
model. MAXPROB II shows similar results, although slightly
better in TS1 but slightly poorer in AO, TS2, and TS3.
Actually,  MAXPROB  II  shows  no  skill  compared  to  chance  in 
category  II.  For  NATR,  AO  peaks  at  the  third  predictor  and 
attains  the  second  highest  percentage  correct.  In  general, 
it  does  much  poorer  than  the  MAXPROB  strategies,  giving 
results  that  more  closely  resemble  the  baseline  results. 

NATR shows no significant difference from baseline in either
AO or TS2, while scoring significantly higher in TS3 (though
only on the margin of being significant with respect to
chance). The TS1 of 0.18 is not significant compared to
chance, but it is significantly worse than baseline.

In general, the methods applied to ceiling heights
produced  very  similar  results  to  those  attained  for  cloud 
amount,  both  in  percentage  correct  and  in  threat  scores. 

BMD  with  MLDC  or  even  EVAR  does  the  best  in  forecasting 
category  II  but  is  poor  in  forecasting  category  I  or  III. 
Conversely,  the  MAXPROB  strategies  are  much  better  at  fore¬ 
casting  categories  I  and  III,  but  at  a  cost  of  reducing  the 
results  for  category  II  below  significant  levels. 

2 .  Area  2,  TAU-00  Ceiling  Using  Cloud  Amount  Observations 

In  these  experiments  cloud  amount  observations  were 
categorized  and  used  as  a  predictor  for  ceiling. 

The  results  of  the  BMD  model  using  cloud  amount 
(Figs.  5,  42-43)  are  excellent  compared  to  the  results 
attained so far in this study. The EVAR model attained an
AO of 67.50 which is 18% higher than the baseline BMD. The
threat  scores  are  consistently  high  as  well,  far  exceeding 
the  baseline  in  every  category  by  .10  to  .44.  QUAD  provides 
nearly  identical  results  across  all  categories. 

The  PR+BMD  model  is  then  run  making  cloud  amount 
available  as  a  predictor  (Figs.  5,  44  a-c) .  The  model  chose 
grouping  size  six  and  attains  peak  AO  at  the  second  predic¬ 
tor.  Cloud  amount  is  the  first  predictor  chosen  and  the 
linear  regression  equation  variable  (not  containing  cloud 
amount)  is  the  second.  MAXPROB  I  and  MAXPROB  II  produce 
identical  results  at  this  level  with  an  AO  of  68.60,  about 
1.0%  higher  than  the  BMD  model  using  cloud  amount.  The 
model's TS3 is an outstanding .72 and TS2 is .49, both of
which  are  very  significant  with  respect  to  chance  and  the 
baseline.  However,  the  resulting  TS1  is  only  .24  which  is 
significant  compared  to  chance  but  is  not  an  improvement  over 
the  baseline.  The  NATR  model  attains  peak  AO  at  the  fourth 
predictor,  and  does  not  perform  as  well  as  MAXPROB  in  any 
category. The TS3 of .64 and the TS2 of .451 show significance
with respect to both chance and the baseline, but the
TS1 value is not significantly different from baseline,
and only marginally significant compared to chance.

The value of good skill in predicting cloud amount
in forecasting ceiling is evident from the above results. The
very high category III results (i.e., ceiling greater than
3500 feet or unlimited) are probably due to the definition
of ceiling, namely that if cloud amount is less than 5/8
then the ceiling is unlimited. The strongest effect of
cloud amount in the forecast of ceiling is whether or not
a ceiling exists, thus the high threat score of category
III which contains all the observations of no ceiling.

VI. CONCLUSIONS

A. CONCLUSIONS

The primary objective of this study was to begin the
investigation into statistical forecasting of cloud amount
and ceiling by extending the methods researched by Karl
(1984) and applied by Diunizio (1984) in the area of visibility.
The ultimate goal is to develop a viable statistical
forecasting scheme suitable for eventual employment in
an  operational  U.S.  Navy  marine  ceiling  and  cloud  amount 
MOS  forecasting  system.  This  is  certainly  not  an  exhaustive 
study  of  the  subject,  but  does  provide  an  important  first 
step  in  statistically  forecasting  these  weather  elements. 

The  results  of  the  tests  in  the  various  areas  and  time 
periods  show  that  the  methods  evaluated  are  useful  in  fore¬ 
casting  both  cloud  amount  and  ceilings.  Although  the  models 
are  not  yet  producing  results  as  good  as  one  might  desire 
for  an  operational  MOS  system,  they  are  forecasting  signifi¬ 
cantly  better  than  pure  chance,  giving  them  useful  skill 
levels.  In  area  4,  TAU-00,  the  single-stage  linear  regression 
performed  the  best,  and  became  the  "baseline"  from  which  to 
measure  the  other  methods.  In  area  2  the  model  that  scored 
consistently highest in all three time periods is the PR+BMD.
The  general  problem  experienced  by  all  the  approaches  in 
area 2 is the inability to forecast the scattered/clear condition
(category I) with any skill. Significant skill in this
category is only attained in a very few cases and then only
at a great cost to the threat scores of the other two categories.
In the initial ceiling studies, PR+BMD gives the
best overall results, but the linear regression is able to
more skillfully forecast category II. In contrast, when
a perfect cloud amount forecast is added as a predictor to
the ceiling models, linear regression gives much better
results overall, especially in forecasting low ceilings.

In the previous MOS studies in this series, a low visibility
situation was clearly the most threatening category
for operational Naval forces and, therefore, was selected
as the criterion to maximize as well as to evaluate one model
against another. In cloud amount predictions, there does
not exist a single category that is clearly more important
than the other two. In the absence of a better measure,
absolute percentage correct was utilized. The study does
reveal a need to develop some evaluation criteria for contingency
table output for the MOS project in general. This
would be of great assistance in the developmental stages of
parameter selection as well as evaluating the overall performance
of a particular model. The two measures used in
this study to evaluate significance of the results proved
to be very useful. The previous studies based significance
testing on a Monte Carlo scheme evaluating a set of 100
randomly generated data sets to produce upper and lower .05
critical values for AO. The significance test used in this
study, derived as a consequence of the central limit theorem,

Note that in this situation there are two thresholds. The
group having the smaller variance will lie between the two
thresholds.


[Figure: class conditional densities and thresholds versus classification index (z)]


The  thresholds  shown  are  typical  of  a  situation  where  p^  <  p^. 
Note that these thresholds lie between the two intersections
of  the  densities.  If  the  inequality  of  prior  probabilities 
were  reversed,  the  thresholds  would  lie  outside  of  the 
region  between  the  two  density  intersections.  Further,  note 
that  the  decision  region  for  the  group  having  the  lesser 
variance  lies  between  the  thresholds. 

c. Case III: General Solution (Referred to as the
Quadratic Model (QUAD) in the text)

$$p(z \mid E=1) = (k/\sigma_1) \exp\{(-1/2)(z - \mu_1)^2/\sigma_1^2\}$$

$$p(z \mid E=0) = (k/\sigma_0) \exp\{(-1/2)(z - \mu_0)^2/\sigma_0^2\}$$

where $\Lambda$ is the likelihood ratio and $p_0 = p[E=0]$ and
$p_1 = p[E=1]$. Thus, the threshold value is

$$z^* = (\mu_0 + \mu_1)/2 + \sigma^2 \ln(p_0/p_1)/(\mu_1 - \mu_0)$$

The position of the threshold depends on the relative values
of $p_1$ and $p_0$. The threshold moves toward the group with the
smallest $p_i$. If $p_1 = p_0$ the threshold will be the value of
z where the densities intersect (i.e., where the densities
are equal).
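The behaviour of this equal-variance threshold can be checked numerically. The sketch below evaluates the threshold expression for $z^*$ given above and confirms that the prior-weighted Gaussian densities are equal there. The function names and the example means, variance and priors are hypothetical.

```python
import math


def evar_threshold(mu0, mu1, sigma, p0, p1):
    """Equal-variance threshold: midpoint of the means plus a prior shift."""
    return 0.5 * (mu0 + mu1) + sigma**2 * math.log(p0 / p1) / (mu1 - mu0)


def weighted_density(z, mu, sigma, prior):
    """Prior times the Gaussian density N(z; mu, sigma^2)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return prior * norm * math.exp(-0.5 * ((z - mu) / sigma) ** 2)


# Hypothetical case: the threat (E = 1) is the rarer event.
z_star = evar_threshold(mu0=0.0, mu1=1.0, sigma=1.0, p0=0.7, p1=0.3)
left = weighted_density(z_star, mu=0.0, sigma=1.0, prior=0.7)
right = weighted_density(z_star, mu=1.0, sigma=1.0, prior=0.3)

# With equal priors the threshold falls at the midpoint of the means.
z_mid = evar_threshold(mu0=0.0, mu1=1.0, sigma=1.0, p0=0.5, p1=0.5)
print(z_star, left, right, z_mid)
```

In the first call the threshold lies above the midpoint, that is, shifted toward the mean of the group with the smaller prior, as stated in the text.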

b. Case II: Equal means; different variances

$$\frac{\sigma_0 \exp\{(-1/2)(z - \mu_1)^2/\sigma_1^2\}}{\sigma_1 \exp\{(-1/2)(z - \mu_0)^2/\sigma_0^2\}} \;\underset{c=0}{\overset{c=1}{\gtrless}}\; \frac{p_0}{p_1}$$

with the threshold

$z^*$

These statements can be combined to give

$$\frac{p(z \mid E=1)}{p(z \mid E=0)} = \Lambda(z) \;\underset{c=0}{\overset{c=1}{\gtrless}}\; \frac{p[E=0]}{p[E=1]}$$

Thresholds are the value(s) of z for which

$$\Lambda(z) = p[E=0]/p[E=1]$$

This  equation  can  be  solved  for  z  either  analytically  or 
numerically  depending  on  the  forms  of  the  density  functions. 
3 .  Threshold  Cases 

In  order  to  exemplify  the  model,  the  assumption  is 

made  that  the  class  conditional  distributions  are  Gaussian. 

There  are  essentially  three  distinct  cases  that  can  arise. 

a.  Case  I:  Equal  variances;  different  means 

(Referred  to  as  the  equal  variance  model  (EVAR) 
in  the  text) 

$$p(z \mid E=1) = k \exp\{(-1/2)(z - \mu_1)^2/\sigma^2\}$$

$$p(z \mid E=0) = k \exp\{(-1/2)(z - \mu_0)^2/\sigma^2\}$$

where:

$$k = (2\pi)^{-1/2}\,\sigma^{-1}$$

$$\frac{\exp\{(-1/2)(z - \mu_1)^2/\sigma^2\}}{\exp\{(-1/2)(z - \mu_0)^2/\sigma^2\}} \;\underset{c=0}{\overset{c=1}{\gtrless}}\; \frac{p_0}{p_1}$$

then,


$$p_e = p[E=0] \int_{z \in Z_1} p(z \mid E=0)\,dz + p[E=1]\left[1 - \int_{z \in Z_1} p(z \mid E=1)\,dz\right]$$

and algebraic rearrangement yields,

$$p_e = p[E=1] + \int_{z \in Z_1} \{\,p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1)\,\}\,dz$$

In order to minimize $p_e$, $Z_1$ (the decision region for C = 1)
will include all those values of z for which the integrand
in the expression for $p_e$ will be negative. The decision regions
can be symbolically represented as follows:

$$Z_0 = \{z : p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1) > 0\}$$

$$Z_1 = \{z : p[E=0]\,p(z \mid E=0) - p[E=1]\,p(z \mid E=1) < 0\}$$

An alternative representation is given by,

$$Z_0 = \{z : p[E=0]\,p(z \mid E=0) > p[E=1]\,p(z \mid E=1)\}
     = \{z : p[E=0]/p[E=1] > p(z \mid E=1)/p(z \mid E=0)\}$$

Likewise,

$$Z_1 = \{z : p[E=0]/p[E=1] < p(z \mid E=1)/p(z \mid E=0)\}$$

The decision regions are mutually exclusive and exhaustive
(i.e., $Z_0 \cap Z_1 = \emptyset$ and $Z = Z_0 \cup Z_1$).
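The minimum probability of error argument can be checked numerically for the equal-variance Gaussian case. The sketch below assumes a single threshold t with $Z_1 = \{z > t\}$ (valid when $\mu_1 > \mu_0$), evaluates $p_e$ from the Gaussian cumulative distribution, and scans a grid of thresholds to locate the minimum. The parameter values and function names are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def error_probability(t, mu0, mu1, sigma, p0, p1):
    """p_e for the rule 'classify C = 1 when z > t' (assumes mu1 > mu0)."""
    false_alarm = 1.0 - normal_cdf(t, mu0, sigma)   # P[C=1 | E=0]
    miss = normal_cdf(t, mu1, sigma)                # P[C=0 | E=1]
    return p0 * false_alarm + p1 * miss


mu0, mu1, sigma, p0, p1 = 0.0, 1.0, 1.0, 0.7, 0.3   # hypothetical values

# Scan thresholds from -3.0 to 5.0 in steps of 0.001.
grid = [i * 0.001 for i in range(-3000, 5001)]
best_t = min(grid, key=lambda t: error_probability(t, mu0, mu1, sigma, p0, p1))

# Closed-form equal-variance threshold for comparison.
z_star = 0.5 * (mu0 + mu1) + sigma**2 * math.log(p0 / p1) / (mu1 - mu0)
print(best_t, z_star)
```

The grid minimum falls at the point where the integrand of the rearranged expression for $p_e$ changes sign, which is the boundary between the two decision regions.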

Thresholds = boundary(s) between decision regions.

$p(z \mid E=0)$ = class conditional density of z given
that E = 0.

$p(z \mid E=1)$ = class conditional density of z given
that E = 1.

$\Lambda(z) = p(z \mid E=1)/p(z \mid E=0)$ = the maximum likelihood
ratio (i.e., the ratio of class conditional
densities).

$p_e = p\{[C=1 \cap E=0],\ [C=0 \cap E=1]\}$ = the total
probability of error.

2. Minimum Probability of Error Criterion

$p_e$ = probability of an incorrect classification.

$$p_e = p[C=1 \mid E=0]\,p[E=0] + p[C=0 \mid E=1]\,p[E=1]$$

where p[E=1] + p[E=0] = 1. Note that the events E = 1
and E = 0 are mutually exclusive and exhaustive. The objective
is to select decision regions (thresholds) so as to minimize $p_e$.

$$p[C=0 \mid E=1] = \int_{z \in Z_0} p(z \mid E=1)\,dz$$

= the probability of misclassifying E = 1.

Since

$$\int_{z \in Z_0} p(z \mid E=1)\,dz + \int_{z \in Z_1} p(z \mid E=1)\,dz = \int_{z \in Z} p(z \mid E=1)\,dz = 1,$$

it follows that

$$p[C=0 \mid E=1] = 1 - \int_{z \in Z_1} p(z \mid E=1)\,dz$$

$$p[C=1 \mid E=0] = \int_{z \in Z_1} p(z \mid E=0)\,dz$$

These are substituted into the expression for $p_e$.
B. THRESHOLDS (Lowe, 1984a)

1. Notation

E = an event; this is an indicator variable: when E = 1,
the threatening event occurs, and when E = 0, the
non-threatening event occurs.

C = the classification of an unknown event: when C = 1,
the event is classified as a threat, and when C = 0,
the event is classified as a non-threat.

$P[E=1]$ = unconditional probability of occurrence of threat.

$P[E=0]$ = unconditional probability of occurrence of non-threat.

Error of the 1st kind (false alarm): $[C=1 \cap E=0]$.

Error of the 2nd kind (miss): $[C=0 \cap E=1]$.

$P[C=1 \cap E=0]$ = joint probability of an error of the 1st kind.

$P[C=0 \cap E=1]$ = joint probability of an error of the 2nd kind.

$P[C=1 \mid E=0]$ = class conditional probability of misclassifying
a non-threat.

$P[C=0 \mid E=1]$ = class conditional probability of misclassifying
a threat.

$P[C=1 \cap E=0] = P[C=1 \mid E=0]\,P[E=0]$.

$P[C=0 \cap E=1] = P[C=0 \mid E=1]\,P[E=1]$.

z = a value of the predictive index (equivalent
to y, above).

Z = range of the predictive index on the real line.

For a dichotomous problem, Z is divided into two parts, $Z_0$ and $Z_1$:
C = 0 if $z \in Z_0$
C = 1 if $z \in Z_1$
Independent variable selection for the BMDP9R program
begins with a general screening of the entire set of potential
predictors. Variables which are identified as redundant
(linear combinations of other variables) with respect to the
predictand are deleted from further consideration. The t
statistics for the coefficients which minimize the Cp value
for each reviewed subset identify the "best" subset. The
number of predictors assigned to each subset can be predefined,
and for this study each subset equation was required to have
six predictors.

The role of regression, once appropriate predictor variables
have been selected, is simply that of dimension reduction
(representing a multivariate structure by a univariate proxy
which constitutes a classificatory or predictive index).
This proxy takes the form of a polynomial, linear in its
coefficients, of the components of the multivariate structure.
The problem now becomes one of determining the form of the
state conditional distributions (one for each group of
interest; e.g., one, two and three for ceiling categories I,
II and III, as used in this study). Once an appropriate
form has been selected, it remains, then, to determine the
parameters of the class conditional distributions (e.g.,
means and variances) and then apply an appropriate decision
criterion or threshold model.
APPENDIX A

LINEAR REGRESSION AND THRESHOLD MODELS

A. LINEAR REGRESSION

The linear regression techniques used in the study were
first presented by Karl (1984) and extended by Diunizio (1984).
The least-squares multiple linear regression program used in
the study is BMDP9R, the all-possible-subsets regression
computer program found in the BMDP Statistical Software
Package (University of California, 1983).

The BMDP9R program employs a "best" possible subset,
derived independently of variables or variable sequence,
calculated from the group of potential predictors. Once this
"best" subset is identified, a linear regression equation is
fitted to the data, based only upon those selected predictors.
The "best" possible subset is calculated by a
Furnival-Wilson algorithm, which provides the user with a
variety of subordinate subsets in addition to the "best"
subset. Three criteria are available to define the "best"
possible subset as a function of independent variables
(predictors) and a dependent variable (predictand): the sample
R², the adjusted R², and Mallows' Cp. The Mallows' Cp criterion
is used in this study, where "best" is defined as the smallest
Cp value.
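
The Cp selection can be sketched as follows. This is a minimal illustration only: it searches every subset of a fixed size by brute force rather than using the Furnival-Wilson algorithm of BMDP9R, it takes Cp = SSE_p / s² - n + 2p with s² estimated from the full model, and all function names are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sse(X, y, cols):
    """Residual sum of squares of a least-squares fit of y on an intercept plus the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def best_subset_by_cp(X, y, size):
    """Return (Cp, column indices) of the subset of the given size with the smallest Cp."""
    n, m = len(X), len(X[0])
    s2 = sse(X, y, range(m)) / (n - m - 1)  # error variance from the full model
    p = size + 1                            # parameters, including the intercept
    best = None
    for cols in combinations(range(m), size):
        cp = sse(X, y, cols) / s2 - n + 2 * p
        if best is None or cp < best[0]:
            best = (cp, cols)
    return best
```

For a fixed subset size, as in the six-predictor requirement of this study, the smallest Cp coincides with the smallest residual sum of squares; Cp only separates candidates once subsets of different sizes are compared.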
an additional parameter. For example, if the particular
parameter at TAU-00, TAU-24 and TAU-48 is given the designation
S1, S2 and S3, respectively, then the time differencing
could be accomplished thus:

TAU-00 forecast period would use forward difference
[3xS3 - 4xS2 + S1]/48

TAU-24 forecast period would use centered difference
[S3 - 2xS2 + S1]/24

TAU-48 forecast period would use backward differencing
[-S1 + 4xS2 - S3]/48

These new parameters could then be used as predictors in the
models.
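
As a point of comparison, a time-trend predictor can be sketched with the textbook second-order finite-difference stencils for a first derivative on three equally spaced values. The stencil coefficients, the 24-hour spacing argument and the function name are assumptions of this sketch, not values taken from the study.

```python
def time_trend(s1, s2, s3, step_hours=24.0):
    """Estimate the rate of change per hour at TAU-00, TAU-24 and TAU-48.

    s1, s2, s3 are the parameter values at TAU-00, TAU-24 and TAU-48.
    Standard second-order stencils are used: one-sided at the two ends
    and centered in the middle.
    """
    h = step_hours
    forward = (-3.0 * s1 + 4.0 * s2 - s3) / (2.0 * h)   # one-sided, at TAU-00
    centered = (s3 - s1) / (2.0 * h)                     # at TAU-24
    backward = (3.0 * s3 - 4.0 * s2 + s1) / (2.0 * h)    # one-sided, at TAU-48
    return forward, centered, backward


# A steadily rising parameter (2 units per 24 hours) gives the same trend at all three times.
steady = time_trend(10.0, 12.0, 14.0)
```

For a parameter that varies quadratically in time these stencils return the exact slope at each of the three times, which is a convenient check when adapting them.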

4. A new predictor also could be developed at each
time period by doing a spatial difference across the observation
points to give a representation of advections (i.e.,
thermal, vorticity, moisture). A potential scheme would be
to use a centered difference at the observation position.
If the parameter at the observation point was labeled
Q(i,j), the east/west advection could be represented by

dQ/dX(i,j) = [Q(i+1,j) - Q(i-1,j)]/2L

where L is the distance between gridpoints. A north/south
advection could be represented by

point  out  one  more  significant  fact.  The  low  values  of 
separation  between  category  I  and  II  for  all  the  predictors 
in  the  area  2  cloud  amount  study  very  obviously  coincide 
with  the  inability  of  any  of  the  forecast  schemes  to  skill¬ 
fully  forecast  category  I.  This,  too,  supports  the  position 
that  new  predictors  or  new  combinations  of  predictors  are 
necessary  to  improve  significantly  the  results  achieved  in 
this  study. 

It is of interest to note that the variables most frequently
used by both the Preisendorfer and regression methods
include vorticity (VOR500, VOR925, and DVRTDZ), low level
winds (UBLW, U1000), low level vapor pressures (EAIR, E850)
and products involving vapor pressure at 700 mb (VE700,
TE700).

B.  RECOMMENDATIONS 

Based  on  the  observations  made  in  this  study  and  the 
conclusions  above,  the  following  recommendations  are  offered 
to  future  researchers: 

1. Interpolate the 12 GMT data base to make TAU-00,
TAU-24 and TAU-48 MOP's available as predictors at every
observation position.

2. Interpolate 00 GMT MOP's to the 1200 GMT ship position
to provide 12-hour history as a new predictor.

3. If the parameters described in 1. and 2. above were
available, then a time differencing could be done on the
predictors to give time trend information to the models as
caused both the PR and linear regression methods to gain
over 20% in percentage correct; stated otherwise, they
experienced a relative improvement of about 40% in their
overall percentage correct (i.e., A0 increased from 47% to
67%). This leads one to believe that the emphasis in further
MOS research should not be in pursuing new statistical
methods, but rather in pursuing new combinations of old
predictors, and new predictors.

The results of the separability measures and cluster
analysis individually are not very impressive. The combined
use of the techniques with a two-stage regression, however,
gives the highest A0 for the cloud amount regression schemes,
and shows some potential as a predictor selection scheme.

The benefit of the two techniques is that they give the
experimenter some control over the parameter selection
process, in contrast to the "black box" parameter selection
by the BIMED statistical software package. These methods
allow the experimenter to adjust the parameter selection
according to the category of greatest interest. For example,
if the third category is the most difficult or most desired
category for forecasting, then the measures of separability
can be used to select predictors providing the maximum
separability between the desired categories. These two
methods also can provide a screening process for new
parameters. With the present models, to evaluate the potential
of a single new parameter, the entire model must be run again
from the beginning. The results of the measures of separability


proved to be much simpler and less time consuming to apply.
The test gives a good first approximation of the significance
of any particular model run. The second tool for evaluating
the results, the 95% confidence intervals derived from a
baseline model contingency table, is very useful in comparing
the improvement of each model, and is especially insightful
in evaluating degradation of results over the 48-hour time
period.

It becomes clear after the first few uses of the various
models that the linear regression techniques are much more
easily handled in the developmental stages than the PR or
PR+BMD models. When placed into an operational MOS system,
the PR models will require several orders of magnitude more
computer memory storage space than their linear regression
counterparts. For these reasons, it would seem that if the
PR methods tested here are to be of viable use operationally,
they must be able to perform significantly better than the
linear regression models.

The results of the ceiling experiment are very encouraging
indeed. The first conclusion from this set of experiments
is that the premise stated early in the study, that good skill
in forecasting cloud amount will be valuable in forecasting
ceiling heights, is correct. The second conclusion is that
the results support the idea that good skill in statistical
forecasting of weather elements is more dependent on having
good predictors and information than on model type. The
addition of a single (perfect) predictor, cloud amount,


Classification  index  (z) 


The remarks given for the figures in cases I and II are also
applicable here. More often than not, only one of a pair of
thresholds induced by differing variances will be of real
interest. If the variances of the two groups are radically
different, then both members of the threshold pair become
important.

4. The Maximum-Likelihood-of-Detection Criterion

For this specific model the following background is
provided:

event space: 2 mutually exclusive populations $\pi_0$, $\pi_1$

forecast decision space: 2 possible forecasts $d_0$, $d_1$

$d_0$ is a correct forecast if $\pi_0$ actually occurs
$d_1$ is a correct forecast if $\pi_1$ actually occurs

Problem: select the decision rule d(z) which maps
the observation space Z into some forecast space
in some optimal manner.
Z may be an observed variable or it may be a
univariate index derived from a number of variables.

For this two-decision problem, Z is partitioned
into two parts, $Z_0$ and $Z_1$.

$d(z) = d_0$ if $z \in Z_0$
$d(z) = d_1$ if $z \in Z_1$

where $Z_0 \cap Z_1 = \emptyset$ and $Z_0 \cup Z_1 = Z$

The maximum-likelihood-of-detection criterion represents
the simplest decision model. The basic idea involves selecting
the forecast (decision) corresponding to the observation
(signal) which is the most likely symptom of the event
subsequently observed. Consider the following example:

problem: diagnose disease A or disease B.

The observed symptoms occur with probability 0.75
for A and 0.1 for B. By the maximum-likelihood-of-detection
criterion (MLDC), diagnose disease A because A is the most
likely cause of the observed symptoms (if there is no more
information). But if we know that A is rare and B is common,
the above decision may not be optimal and MLDC may not be
appropriate. MLDC requires only that we know the event
conditional probability density functions of the observations
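
The disease example can be worked through numerically. The symptom probabilities 0.75 and 0.1 are those given above; the prior probabilities of the two diseases and the function names are hypothetical values introduced only to show how a rare A changes the decision.

```python
def mldc(likelihoods):
    """Maximum-likelihood-of-detection: pick the event that makes the observation most likely."""
    return max(likelihoods, key=likelihoods.get)


def prior_weighted(likelihoods, priors):
    """Pick the event with the largest product of prior probability and likelihood."""
    return max(likelihoods, key=lambda event: priors[event] * likelihoods[event])


# Probability of the observed symptoms given each disease (from the example above).
symptom_likelihood = {"A": 0.75, "B": 0.1}

# Hypothetical priors: disease A is rare, disease B is common.
assumed_priors = {"A": 0.01, "B": 0.99}

mldc_diagnosis = mldc(symptom_likelihood)
weighted_diagnosis = prior_weighted(symptom_likelihood, assumed_priors)
```

With these assumed priors the weighted products are 0.0075 for A and 0.099 for B, so the prior-weighted rule selects B while MLDC selects A.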


*  note  the  class  having  the  largest  variance  has  a 
bifurcated  decision  region. 


In  the  case  where  the  variances  are  equal,  the 
situation  simplifies  considerably. 


$$2\sigma^2 z(\bar{z}_1 - \bar{z}_0) + \sigma^2(\bar{z}_0^2 - \bar{z}_1^2) \gtrless 0$$

$$2\sigma^2 z(\bar{z}_1 - \bar{z}_0) - \sigma^2(\bar{z}_1^2 - \bar{z}_0^2) \gtrless 0$$

$$z \gtrless \frac{\bar{z}_1 + \bar{z}_0}{2}$$

$$z^* = \frac{\bar{z}_0 + \bar{z}_1}{2}$$

It  is  obvious  that  z*  is  simply  the  average  of  the 
means  of  the  class-conditional  distributions  and  is  found 
at  the  intersections  of  the  two  density  curves. 

In the foregoing, normal class conditional distributions
were assumed. This was done because the Gaussian form
admits of a rather clean analytical solution. However, the
general concept of the minimum probable error decision
criterion may be applied to any form of density function.
Indeed, the density function of one group need not even be
the same form as that for another group (one might be
exponential and the other Gaussian). The difficulty with most
non-Gaussian forms is that they seldom admit of closed
analytical forms and require numerical means in determination
of thresholds.
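
For the Gaussian case, the thresholds can be computed by solving the quadratic obtained from equating the prior-weighted class conditional densities. The sketch below is a minimal illustration under that assumption; the function name, argument names and example parameters are hypothetical.

```python
import math


def mpe_thresholds(m0, s0, m1, s1, p0=0.5, p1=0.5):
    """Return the z values where p0 * N(z; m0, s0) equals p1 * N(z; m1, s1).

    m and s are class means and standard deviations, p0 and p1 the prior
    probabilities. Taking logarithms of both sides gives a z^2 + b z + c = 0.
    """
    a = 1.0 / (2.0 * s1 ** 2) - 1.0 / (2.0 * s0 ** 2)
    b = m0 / s0 ** 2 - m1 / s1 ** 2
    c = (m1 ** 2 / (2.0 * s1 ** 2) - m0 ** 2 / (2.0 * s0 ** 2)
         + math.log((p0 * s1) / (p1 * s0)))
    if abs(a) < 1e-12:
        # Equal variances: the quadratic term vanishes, leaving one threshold.
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # the weighted densities never cross
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


# Equal variances and equal priors: a single threshold at the average of the means.
single = mpe_thresholds(0.0, 1.0, 2.0, 1.0)
# Unequal variances: a pair of thresholds, so the wider class has a split decision region.
pair = mpe_thresholds(0.0, 1.0, 0.0, 2.0)
```

Non-Gaussian densities would replace the closed-form quadratic with a numerical root search on the difference of the weighted densities.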
APPENDIX B


MEASURES OF SEPARABILITY


As the testing proceeded through progressive time stages
in the study, it became apparent that the methods were unable
to separate the categories of scattered and broken clouds,
categories I and II. This problem required the investigation
of some alternate predictor selection schemes to improve the
ability to discriminate between these categories.

The decision information for discriminating between two
categories comes from two sources: the separation of the two
means and the difference in the variances. The three measures
considered in this study are the Divergence, the Bhattacharyya
distance and the Mahalanobis distance. These three measures
attempt to combine both sources of information to come up
with a single measure of the ability of a predictor to
describe the separation in the categories of the predictand.
These measures are applied in the study by stratifying each
predictor by event (i.e., predictand category), and calculating
the mean and variance of the stratified predictors.
Then, for each predictor, the measures of separability are
calculated for category I versus II, category I versus III,
and category II versus III. The results are shown in tabular
form in Tables 11 to 13.

The Mahalanobis distance considers the variances as equal
and uses pooled variances of the predictor in the following
univariate form:

[Equation: univariate Mahalanobis distance; not legible in the source.]

It can be thought of as a signal-to-noise ratio where the
difference in the two means is the desired signal, and the
noise is the scatter within the whole set (the variance).

The  Divergence  does  not  assume  equal  variance,  and, 
therefore,  does  not  use  a  pooled  variance.  It  adds  to  the 
signal-to-noise  ratio  two  quotients  of  the  variances  adjusted 
by  the  equal  variance  value  (two) .  It  has  the  effect  of 
combining  the  signal-to-noise  ratio  with  information  contained 
in  the  variances.  The  Divergence  is  used  in  this  study  in 
its  univariate  form: 


[Equation: univariate Divergence; not legible in the source.]


The  third  measure  of  separability  applied  to  the  data 
set,  the  Bhattacharyya  distance,  is  a  special  case  of  the 
Chernoff  distance.  Although  more  complicated  than  the 
Divergence,  it  also  combines  the  information  contained  in  the 
mean  with  that  found  in  the  variance.  The  Bhattacharyya  is 
used  in  the  study  in  its  univariate  form  as: 


[Equation: univariate Bhattacharyya distance; not legible in the source.]
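
For orientation, the three measures can be computed from stratified means and variances as sketched below. The formulas are the common textbook forms for two univariate normal distributions and are an assumption of this sketch; the normalization used in the study may differ, and the function names are hypothetical.

```python
import math


def mahalanobis(m1, v1, n1, m2, v2, n2):
    """Squared mean difference divided by the pooled variance of the two groups."""
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) ** 2 / pooled


def divergence(m1, v1, m2, v2):
    """Divergence between two normal densities; no equal-variance assumption."""
    mean_term = 0.5 * (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2)
    variance_term = 0.5 * (v1 / v2 + v2 / v1 - 2.0)  # zero when the variances are equal
    return mean_term + variance_term


def bhattacharyya(m1, v1, m2, v2):
    """Bhattacharyya distance between two normal densities."""
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    variance_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + variance_term


# Hypothetical stratified statistics for one predictor (means, variances, sample sizes).
cat1 = {"mean": 0.0, "var": 1.0, "n": 50}
cat2 = {"mean": 1.0, "var": 1.0, "n": 60}
d_mahal = mahalanobis(cat1["mean"], cat1["var"], cat1["n"],
                      cat2["mean"], cat2["var"], cat2["n"])
d_div = divergence(cat1["mean"], cat1["var"], cat2["mean"], cat2["var"])
d_bhat = bhattacharyya(cat1["mean"], cat1["var"], cat2["mean"], cat2["var"])
```

In each measure the variance term vanishes when the two variances are equal, leaving only the contribution of the mean separation.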
through XII, for homogeneous area 2 at the time period
TAU-00.
APPENDIX C


CLUSTER ANALYSIS

The object of cluster analysis is to take a sample of
potential predictor variables of unknown classification and
group them into natural classes or clusters. The fact that
there is no a priori classification of the sample suggests
that cluster analysis is fundamentally a tool for data
exploration. That is to say, one wishes to study the data
to see if natural and useful groupings do, in fact, exist.
It is important to note that for any application of the method
there are many possible classifications which can be imposed
on a sample. Therefore, the sort of groupings which emerge
from an analysis will depend very much on the variables used
to represent the predictand. A poor choice of variables
can lead to a clustering which is useless for a particular
purpose.

The clustering done for cloud amount uses the BMDP
Statistical Software (University of California, 1983) P1M program,
applied to all available Model Output Parameters (MOP's). The
P1M program provides four measures of similarity (association)
for clustering variables and three criteria for linking or
combining clusters. Initially, each variable is considered as
a separate cluster; then, the two most similar variables are
joined to form a cluster. The amalgamating process continues
in a stepwise fashion (joining variables or clusters of
variables) until a single cluster is formed that contains
all the variables.

As  used  in  this  study,  the  measure  of  similarity  is  the 
absolute  value  of  the  correlation.  The  similarity  measure 
could  also  be  obtained  from  a  measure  of  the  distance,  such 
as  the  angle  between  two  variables  (arccosine  of  the  corre¬ 
lation)  or  the  acute  angle  corresponding  to  the  arccosine 
of  the  absolute  value  of  the  correlation. 

The  linkage  rule  (the  criterion  for  combining  two 
clusters)  can  be  the  minimum  distance  (or  maximum  similarity) 
over  all  pairings  of  the  variables  between  the  two  clusters, 
the  maximum  distance  (or  minimum  similarity) ,  or  the  average 
distance  (or  similarity) .  The  average  similarity  is  the 
arithmetic  average  of  the  similarity  using  all  possible 
pairings of the variables between the two clusters.
Maximum-similarity (minimum-distance) single linkage is used
for the MOP's in this study.
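
The amalgamation procedure described above can be sketched in a few lines. This is an illustrative implementation of single linkage on the absolute correlation, not the P1M program itself; the function names, the stopping rule (a requested number of clusters rather than a full tree) and the sample data are assumptions.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def single_linkage(variables, n_clusters):
    """Cluster variables (dict of name -> values) until n_clusters remain.

    Similarity between two variables is |correlation|. Similarity between
    two clusters is the maximum over all cross pairings (single linkage).
    """
    names = list(variables)
    sim = {(a, b): abs(correlation(variables[a], variables[b]))
           for a in names for b in names if a != b}
    clusters = [{name} for name in names]  # every variable starts alone
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = max(sim[(a, b)] for a in clusters[i] for b in clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # join the two most similar clusters
        del clusters[j]
    return clusters


# Hypothetical predictors: "a" and "b" move together, as do "c" and "d".
sample = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [-1, -2, -3, -4, -5, -7],
    "c": [1, -1, 1, -1, 1, -1],
    "d": [2, -2, 2, -2, 2, -1.5],
}
groups = single_linkage(sample, 2)
```

Because the absolute value of the correlation is used, the strongly negatively correlated pair "a" and "b" is treated as highly similar and joined early.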

The output of the P1M program for homogeneous ocean area
2, at time TAU-00, is:

Predictor Clusters

Cluster   Predictors

1.        D1000, D850, D925, D700, D500, D400, D300, D250

2.        T500, T400, DDDP, T700, T300, TE700

3.        VOR500, VOR925, DVRTDP

4.        TAIR, T1000, T925

5.        EAIR, E1000, E850, E925, EPRD, TE925,
          E700, E500, T250

6.        PBLD, STRTTK, RELH, STRTFQ

7.        SMF, SHF

8.        VBLW, V850, V925, V1000, V700, V500,
          V400, V300, V250, VT250, VE700, VT700,
          UDVDZ

9.        UBLW, U850, U925, U1000, U700, U500,
          U400, U300, U250

10.       DRAG, ETRNMT

APPENDIX D


NOGAPS PREDICTOR PARAMETERS AVAILABLE FOR THE NORTH
ATLANTIC OCEAN, 15 MAY-15 JULY 1983, EXPERIMENTS


Area: Entire North Atlantic Ocean and Mediterranean Sea

Model output time: 1200 GMT (TAU-00)

Model output
parameter      Descriptive name of parameter

D1000          1000 mb geopotential height
D925           925 mb geopotential height
D850           850 mb geopotential height
D700           700 mb geopotential height
D500           500 mb geopotential height
D400           400 mb geopotential height
D300           300 mb geopotential height
D250           250 mb geopotential height
TAIR           Surface air temperature
T1000          1000 mb temperature
T925           925 mb temperature
T700           700 mb temperature
T500           500 mb temperature
T400           400 mb temperature
T300           300 mb temperature
T250           250 mb temperature
EAIR           Surface vapor pressure
E1000          1000 mb vapor pressure
E925           925 mb vapor pressure
E850           850 mb vapor pressure
E700           700 mb vapor pressure
E500           500 mb vapor pressure
UBLW           Boundary layer zonal wind component
U1000          1000 mb zonal wind component
U925           925 mb zonal wind component
U850           850 mb zonal wind component
U700           700 mb zonal wind component
U500           500 mb zonal wind component
U400           400 mb zonal wind component
U300           300 mb zonal wind component
U250           250 mb zonal wind component
VBLW           Boundary layer meridional wind component
V1000          1000 mb meridional wind component
V925           925 mb meridional wind component
V850           850 mb meridional wind component
V700           700 mb meridional wind component
V500           500 mb meridional wind component
V400           400 mb meridional wind component
V300           300 mb meridional wind component
V250           250 mb meridional wind component
VOR925         925 mb vorticity
VOR500         500 mb vorticity
PS             Surface pressure
SMF            Surface moisture flux
PBLD           Planetary boundary-layer depth
STRTFQ         Percent stratus frequency
STRTTH         Stratus thickness
SHF            Surface heat flux
ENTRN          Entrainment at top of marine boundary-layer
DRAG           Drag coefficient (CD)

Derived parameters

RELH           Surface relative humidity
DVRTDP         Vertical gradient of vorticity (VOR925 - VOR500)
EPRD           Product of vapor pressures (E1000*E850)
DDDP           Height thickness (D925-D250)/675
VT700          Approximation of thermal advection (V700*T700)
UDVDZ          Approximation of thermal advection (U700*(V1000-V500))
TE700          Product of temperature and vapor pressure (T700*E700)
VT250          Approximation of thermal advection (T250*V250)
TE925          Product of temperature and vapor pressure (T925*E925)


Area: Entire North Atlantic Ocean and Mediterranean Sea

Model output time: 1200 GMT (TAU-24 and TAU-48)

Parameters available and derived parameters at TAU-24 and
TAU-48 are the same as those for TAU-00 with the addition of
the following five parameters:

Model output
parameter      Descriptive name of parameter

PRECIP         Total amount (mm.) of model precipitation
               in the last six hours
SHWRS          Total amount (mm.) of model precipitation
               associated with cumulus convection in the
               last six hours
INSTAB         Boundary layer inversion instability
DIV925         925 mb Divergence
DIV500         500 mb Divergence

AD-A155 100  AN EVALUATION OF DISCRETIZED CONDITIONAL PROBABILITY
AND LINEAR REGRESSIO.. (U)  NAVAL POSTGRADUATE SCHOOL
MONTEREY CA  M H WOOSTER  DEC 84

UNCLASSIFIED  F/G  4/2 


APPENDIX E


VERIFICATION SCORES, DEFINITIONS


               3 |  R    S    T
   FORECAST    2 |  U    V    W
               1 |  X    Y    Z
                 +---------------
                    1    2    3
                     OBSERVED


Total = R + S + T + U + V + W + X + Y + Z

A0  = percent correct = (X+V+T)/Total

A1  = one-class error = (U+S+Y+W)/Total

TS1 = Threat score for category I = X/(R+U+X+Y+Z)

TS2 = Threat score for category II = V/(U+V+W+S+Y)

TS3 = Threat score for category III = T/(R+S+T+W+Z)
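
These scores can be computed directly from a contingency table. The sketch below is illustrative only: it assumes the table is stored with forecast categories as rows and observed categories as columns (category I first), and the function name and sample counts are hypothetical.

```python
def verification_scores(table):
    """Return (A0, A1, threat scores) for a square contingency table.

    table[f][o] is the number of cases forecast as category f + 1 and
    observed as category o + 1.
    """
    n = len(table)
    total = sum(sum(row) for row in table)
    # A0: fraction of forecasts in the correct category (the diagonal).
    a0 = sum(table[k][k] for k in range(n)) / total
    # A1: fraction of forecasts that miss by exactly one category.
    a1 = sum(table[f][o] for f in range(n) for o in range(n)
             if abs(f - o) == 1) / total
    threat = []
    for k in range(n):
        hits = table[k][k]
        forecast_k = sum(table[k])                  # all forecasts of category k
        observed_k = sum(row[k] for row in table)   # all observations of category k
        threat.append(hits / (forecast_k + observed_k - hits))
    return a0, a1, threat


# Hypothetical counts: rows are forecast categories I-III, columns are observed I-III.
sample_table = [
    [50, 10, 5],
    [15, 40, 10],
    [5, 20, 45],
]
a0, a1, ts = verification_scores(sample_table)
```

Each threat score divides the hits for a category by the hits plus the misses plus the false alarms for that category, which is the same cell grouping as in the definitions above.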
APPENDIX F


BMDP LINEAR REGRESSION EQUATION PREDICTOR SETS.

NORTH ATLANTIC OCEAN (PR+BMD)

These are the derived linear regression equations used
as additional predictors in the PR+BMD model. The BMD value
of each equation represents an estimate of the category
predictand.

I. Area 4, TAU-00, Cloud amount

BMD1 = 1.87764 + 0.57546E-07xU850 + 0.372xE700
       - 0.4595xT500 - 0.00837xSTRFQ - 9640.3555xVOR500
       + 28687.457xVOR925

BMD2 = -0.293341 - 0.257147xTX + 0.0008191xUBLW
       - 0.055399xT500 - 3345.81xVOR500
       + 8551.9xVOR925 - 0.002537xEPRD

II. Area 2, TAU-00, Cloud amount

BMD1 = 2.05292 - 0.09055xEAIR + 0.19066E-03xUBLW
       - 5335.98438xVOR500 + 7474.707xVOR925
       + 0.00505xEPRD + 0.78387E-07xUDVDZ

BMD2 = 2.51018 - 0.28119E-03xU700 + 0.31987xE500
       + 1.73035xDVRTDP + 0.27946E-04xU700
       - 0.2993E-05xVT250 + 0.00236xTE925

III. Area 2, TAU-24, Cloud amount

BMD1 = 2.98984 - 0.11211xEAIR + 0.88063xD700
       - 0.01415xSHF - 8.28037xDIV925
       - 2.27441xVOR925 + 0.00656xEPRD

BMD2 = 1.95832 - 0.05608xTAIR - 0.084347xEAIR
       - 0.01297xSMF + 0.12733xDIV925
       + 0.8508E-05xU1000 + 0.192768xD700

IV. Area 2, TAU-48, Cloud amount

BMD1 = 1.5808 + 0.35787E-03xPS - 0.00998xEAIR
       - 0.01438xV850 + 0.14885E-03xD500
       + 0.0412xV500 + 0.01165xTE700

BMD2 = 2.45617 - 0.5245xEAIR + 0.15573xE500
       + 0.06068xE925 - 0.2383E-03xSTRFQ
       + 0.0837xSTRTK - 6.19808xDIV925

V. Area 2, TAU-00, Ceiling

BMD1 = 2.56681 - 0.03478xE850 + 0.1884E-03xV700
       - 0.03513xT925 + 0.4294E-03xDRAG
       - 2759.3096xVOR500 - 0.63253E-04xVE700

BMD2 = 3.78741 - 0.15939xCLAMT - 0.5971E-04xUBLW
       + 0.1187E-03xV700 - 0.06054xE925
       - 2607.6801xVOR500 - 0.4057E-04xVE700


APPENDIX G


BMDP LINEAR REGRESSION EQUATION PREDICTOR SETS.

NORTH ATLANTIC FOR REGRESSION MODELS

These are the derived linear regression equations used
in the one and two stage regression models. The BMD value
of each equation represents an estimate of the category
predictand.

I. Area 4, TAU-00, Cloud amount

a. Two stage regression

V1 = 0.763701 - 0.145506xTX - 0.004051xSHF
     + 0.13055xT1000 + 0.000208xUBLW
     - 0.0001745xU1000 + 0.073322xE850

V2 = 0.290741 - 0.12764xTX - 0.010425xSHF
     + 0.00017xUBLW + 0.11457xT1000
     - 0.000761xU1000 + 0.0792243xE850

b. Single stage regression

V1 = -0.29334 - 0.025715xTX + 0.0008191xUBLW
     - 0.055399xT500 - 3345.81xVOR500
     + 8551.8xVOR925 - 0.002537xEPRD

II. Area 2, TAU-00, Cloud amount

a. Single stage regression

V1 = 2.05292 - 0.09055xEAIR + 0.19066E-03xUBLW
     - 5335.9844xVOR500 + 7474.707xVOR925
     + 0.00505xEPRD + 0.78387E-07xUDVDZ

b. Single stage regression Separation Test 1

V1 = 1.87611 + 0.98222E-04xUBLW - 0.22933E-04xU1000
     - 0.66081E-04xD850 + 0.5844E-04xD700
     - 0.02909xSHF + 4128.51953xVOR925

c. Single stage regression Separation Test 2

V1 = 1.73895 + 0.05728xE850 + 0.19578xE700
     - 0.04394xE500 + 1.56723xDVRTDP
     + 0.21519E-04xVE700 - 0.00557xTE925

d. Single stage regression Separation & Cluster

V1 = 2.56562 - 0.75713E-03xPS - 0.05658xEAIR
     + 0.49973E-04xU1000 + 0.07972xT925
     + 0.00931xSTRTTK + 4624.29688xVOR925
     + 0.20189E-04xVT700 + 0.00127xTE700

e. Two stage regression separation test

V1 = 1.3457 + 0.43966E-04xUBLW + 0.5055E-05xD1000
     + 0.64366E-04xU1000 - 0.61831E-05xD850
     + 0.00285xSHF + 2488.08984xVOR925

V2 = 2.3040 - 0.01869xE850 + 0.13171xE700
     - 0.06215xE500 + 1.46893xDVRTDP
     + 0.39909E-05xVE700 - 0.2294E-04xTE925

III. Area 2, TAU-24, Cloud Amount

a. Single stage regression

V1 = 2.98984 - 0.11211xEAIR + 0.88063xD700
     - 0.01415xSHF - 8.28037xDIV925
     - 2.27441xVOR925 + 0.00656xEPRD

IV. Area 2, TAU-48, Cloud Amount

a. Single stage regression

V1 = 2.45617 - 0.05245xEAIR + 0.15573xE500
     + 0.06068xE925 - 0.2383E-03xSTRFQ
     + 0.00837xSTRTK - 6.19809xDIV925

V. Area 2, TAU-00, Ceiling

a. Single stage regression, no cloud amount variable

V1 = 2.56681 - 0.03478xE850 + 0.18836E-03xV700
     - 0.03513xT925 + 0.4294E-03xDRAG
     - 2759.3096xVOR500 - 0.63253E-04xVE700

b. Single stage regression with cloud amount variable

V1 = 3.78741 - 0.15939xCLAMT - 0.5971E-04xUBLW
     + 0.1187E-03xV700 - 0.06054xE925
     - 2607.6801xVOR500 - 0.4057E-04xVE700


APPENDIX  H 


TABLES 


TABLE I

A summary of 1200 GMT cloud amount
observations, 15 May to 07 July 1983,
North Atlantic Ocean homogeneous areas
as shown in Fig. 1: TAU-00


Total     CAT I     CAT II    CAT III

11428     4022      4485      2921


TABLE II

A summary of 1200 GMT cloud amount
observations, 15 May to 07 July 1983,
North Atlantic Ocean homogeneous areas
as shown in Fig. 1: TAU-24


Area    Total     CAT I     CAT II    CAT III

All     9416      3378      3616      2422
                  (.36)     (.38)     (.26)

1       1460      281       583       596
                  (.19)     (.40)     (.41)

2       1422      290       550       582
                  (.20)     (.39)     (.41)

3       916       259       298       359
                  (.28)     (.33)     (.39)

4       2592      857       988       747
                  (.33)     (.38)     (.29)

5       1992      1050      719       223
                  (.53)     (.36)     (.11)

6       1684      690       652       342
                  (.41)     (.39)     (.20)

7       458       140       207       111
                  (.31)     (.45)     (.24)

8       874       457       351       66
                  (.52)     (.40)     (.08)
TABLE III

A summary of 1200 GMT cloud amount
observations, 15 May to 07 July 1983,
North Atlantic Ocean homogeneous areas
as shown in Fig. 1: TAU-48


Area    Total     CAT I     CAT II    CAT III

All     10775     3817      4150      2808
                  (.35)     (.39)     (.26)

1       1676      339       691       646
                  (.20)     (.41)     (.39)

2       1644      327       656       661
                  (.20)     (.40)     (.40)

3       1046      308       336       402
                  (.29)     (.32)     (.38)

4       2976      947       1145      884
                  (.32)     (.38)     (.30)

5       2264      1166      823       275
                  (.52)     (.36)     (.12)

6       1949      804       753       392
                  (.41)     (.39)     (.20)

7       524       158       234       132
                  (.30)     (.45)     (.25)

8       976       505       382       89
                  (.52)     (.39)     (.09)
DEPENDENT DATA
Figure 5.  Contingency table results for the
area 4, TAU-00, two-stage regression
QUAD model for cloud amount


SIGNIFICANCE TEST

Null Hypothesis Contingency Table (Chance)

A0(%)   33.36
TS1       .20
TS2       .22
TS3       .18

Area 4, TAU-00

95% CONFIDENCE INTERVAL

A0(%)   30.53 - 36.19
TS1       .17 - .22
TS2       .20 - .25
TS3       .15 - .20

Figure  3.  Confidence  intervals  for  significance 
with  respect  to  chance--area  4, 
TAU-00,  cloud  amount 
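
A chance (null hypothesis) contingency table of the kind named in Figure 3 can be sketched from the marginal totals of an observed-versus-forecast table. The Python sketch below assumes that A0 is the percentage of correct forecasts and that TS1 to TS3 are per-category threat scores; this section does not define either, so both readings are assumptions. The function names and the 3 x 3 counts are hypothetical and are not data from the study.

```python
def chance_table(table):
    """Expected counts under independence: row total * column total / N."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_tot] for r in row_tot]


def percent_correct(table):
    """Assumed meaning of A0: 100 * (sum of diagonal) / N."""
    n = sum(sum(row) for row in table)
    return 100.0 * sum(table[i][i] for i in range(len(table))) / n


def threat_scores(table):
    """Assumed meaning of TSi: hits / (forecasts + observations - hits)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return [
        table[i][i] / (row_tot[i] + col_tot[i] - table[i][i])
        for i in range(len(table))
    ]


# Hypothetical 3 x 3 counts (rows and columns are the three categories).
example = [[30, 20, 10], [20, 40, 20], [10, 20, 30]]
chance = chance_table(example)
print(round(percent_correct(chance), 2))
print([round(ts, 2) for ts in threat_scores(chance)])
```

With equal marginal totals in all three categories this construction gives a chance A0 of one third and chance threat scores of .20, which is the same order of magnitude as the values listed in Figure 3.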
TABLE XII (Continued)


VARIABLE    BHATTACHARYYA    DIVERGENCE    MAHALANOBIS

T300        0.21056          0.22100       0.20881
U300        0.01408          0.02348       0.01272
V300        0.07071          0.07155       0.07062
D925        0.00771          0.00836       0.00762
T925        0.20461          0.21458       0.20294
E925        0.20111          0.30206       0.18641
U925        0.00317          0.01440       0.00158
V925        0.12695          0.13894       0.12507
D250        0.19249          0.19258       0.19250
T250        0.01740          0.02707       0.01600
U250        0.01320          0.02348       0.01172
V250        0.06705          0.06854       0.06687
PBLD        0.01446          0.01503       0.01438
STFQ        0.06272          0.06298       0.06267
STSK        0.01442          0.01972       0.01367
SHF         0.36083          0.36084       0.36084
ETRN        0.01873          0.09273       0.00844
DRAG        0.08309          0.22548       0.06334
VOR5        0.13266          0.13638       0.13223
VOR9        0.00003          0.00005       0.00003
RHSU        0.01062          0.02140       0.00910
DDDP        0.27874          0.28099       0.27825
DVRT        0.35180          0.36540       0.34936
EPRD        0.21446          0.34633       0.19553
VT70        0.02794          0.21941       0.00217
VE70        0.17673          0.46591       0.13800
UDVZ        0.00548          0.03222       0.00169
TE70        0.18493          0.32513       0.16501
VT25        0.06871          0.07051       0.06849
TE92        0.24765          0.39902       0.22596
RH50        0.12367          0.12393       0.12361
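
One simple way to use a separability listing such as Table XII (Continued) is to sort the predictors by one of the three measures and keep the highest-scoring ones. The Python sketch below does this for a handful of rows copied from the table. The function name, the choice of rows, and the cut-off of three are illustrative assumptions, and the selection rule actually used in the study is not reproduced here.

```python
# (variable, Bhattacharyya, divergence, Mahalanobis) for a few rows of
# Table XII (Continued), copied as printed.
ROWS = [
    ("T300", 0.21056, 0.22100, 0.20881),
    ("SHF", 0.36083, 0.36084, 0.36084),
    ("DVRT", 0.35180, 0.36540, 0.34936),
    ("VE70", 0.17673, 0.46591, 0.13800),
    ("VOR9", 0.00003, 0.00005, 0.00003),
    ("TE92", 0.24765, 0.39902, 0.22596),
]
MEASURES = {"bhattacharyya": 1, "divergence": 2, "mahalanobis": 3}


def rank_predictors(rows, measure, top=3):
    """Return the names of the `top` predictors with the largest value."""
    idx = MEASURES[measure]
    ranked = sorted(rows, key=lambda r: r[idx], reverse=True)
    return [r[0] for r in ranked[:top]]


for name in MEASURES:
    print(name, rank_predictors(ROWS, name))
```

For these rows the ordering depends on the measure: VE70 ranks first under divergence but falls outside the top three under the other two measures.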
TABLE XII


Listing  of  the  measures  of  separability, 
by  predictor,  for  cloud  amount  categories 
II  versus  III,  North  Atlantic  Ocean 
homogeneous  area  2  at  time  period  TAU-00 


VARIABLE    BHATTACHARYYA    DIVERGENCE    MAHALANOBIS

PS          0.00117          0.00218       0.00103
TX          0.12979          0.13784       0.12850
EX          0.07575          0.09323       0.07315
SMF         0.20953          0.22308       0.20791
UBLW        0.00386          0.02460       0.00092
VBLW        0.12082          0.14304       0.11745
D100        0.00137          0.00229       0.00124
T100        0.12849          0.13413       0.12757
E100        0.10728          0.14164       0.10219
U100        0.00279          0.01875       0.00052
V100        0.10973          0.13279       0.10625
D850        0.02216          0.02261       0.02210
E850        0.30505          0.44064       0.028520
U850        0.00403          0.01390       0.00262
V850        0.12095          0.12695       0.11997
D700        0.07096          0.07210       0.07083
T700        0.31687          0.32132       0.31640
E700        0.50219          0.59661       0.48725
U700        0.00688          0.01573       0.00561
V700        0.08627          0.08864       0.08588
D500        0.14117          0.14293       0.14100
T500        0.22404          0.22409       0.24405
E500        0.34663          0.41264       0.33632
U500        0.01009          0.01637       0.00919
V500        0.07677          0.07727       0.07667
D400        0.16929          0.17051       0.16919
T400        0.22131          0.22215       0.22110
U400        0.01272          0.01996       0.01168
V400        0.07573          0.07575       0.07573
D300        0.18786          0.18828       0.18785
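
For orientation, the three measures named in the table headings have standard textbook forms when each category is modelled as a univariate normal distribution. The Python sketch below implements those forms. It is an assumption that the study used comparable expressions; its scaling and any pooling of variances may differ, so the functions are not expected to reproduce the tabulated values. Function names and the example inputs are hypothetical.

```python
import math


def bhattacharyya(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def divergence(mu1, var1, mu2, var2):
    """Symmetric (Jeffreys) divergence between the two normal densities."""
    var_term = 0.5 * (var1 / var2 + var2 / var1 - 2.0)
    mean_term = 0.5 * (mu1 - mu2) ** 2 * (1.0 / var1 + 1.0 / var2)
    return var_term + mean_term


def mahalanobis_sq(mu1, var1, mu2, var2):
    """Squared Mahalanobis distance using the average of the two variances."""
    pooled = 0.5 * (var1 + var2)  # equal-weight pooling is an assumption
    return (mu1 - mu2) ** 2 / pooled


# Hypothetical class statistics: unit variances, means one unit apart.
print(bhattacharyya(0.0, 1.0, 1.0, 1.0))
print(divergence(0.0, 1.0, 1.0, 1.0))
print(mahalanobis_sq(0.0, 1.0, 1.0, 1.0))
```

All three functions return zero for identical classes and grow as the class means separate; only the Bhattacharyya distance and the divergence also respond to a difference in variances.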

TABLE  XI  (Continued) 


VARIABLE    BHATTACHARYYA    DIVERGENCE    MAHALANOBIS

T300        0.20562          0.24049       0.19159
U300        0.03037          0.03054       0.03045
V300        0.16807          0.16866       0.16694
D925        0.04597          0.04739       0.04533
T925        0.06919          0.07071       0.06829
E925        0.14805          0.25984       0.12245
U925        0.06896          0.09353       0.06296
V925        0.21093          0.26951       0.19104
D250        0.04663          0.04671       0.04651
T250        0.10538          0.12246       0.09963
U250        0.02878          0.02899       0.02886
V250        0.16161          0.16237       0.16037
PBLD        0.09085          0.09125       0.09032
STFQ        0.15207          0.15208       0.15220
STSK        0.09857          0.09902       0.09797
SHF         0.28824          0.35556       0.26172
ETRN
Listing of the measures of separability,
by predictor, for cloud amount categories
Figure 6.  Contingency table results for the
area 4, TAU-00, two-stage regression
MLDC model for cloud amount
TAU-00, cloud amount
EVAR model for cloud amount
A0 achieved at the second predictor.
A0 achieved at the third predictor.